The Paperclip Maximizer was a conceptual mistake - not because it's wrong, but because people get hung up on the "paperclip" bit. Bostrom and the disciples of Yudkowsky should've picked a different thought experiment to promote, something more palatable to a general population that seems to have trouble with the idea of generic types.
The Paperclip Maximizer example was meant to be read as:

    template<typename Paperclip>
    class Maximizer { ... };
It should be clear now that the Paperclip part stands in for anything someone might want more of and might therefore, intentionally or accidentally, set as the target for a powerful optimizer.
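To make the substitution concrete, here's a minimal sketch; the specific Target types (Engagement, Profit) are just my own throwaway examples, nothing canonical:

    #include <iostream>

    // Stand-ins for "things someone might want more of".
    struct Paperclips {};
    struct Engagement {};
    struct Profit {};

    // The optimizer is generic over its target; nothing in it is paperclip-specific.
    template <typename Target>
    class Maximizer {
    public:
        void step() { std::cout << "acquire more\n"; }
    };

    int main() {
        Maximizer<Paperclips> toy;   // the instance everyone fixates on
        Maximizer<Engagement> feed;  // same machinery, different Target
        Maximizer<Profit> firm;      // likewise
        toy.step();
        feed.step();
        firm.step();
    }

The whole point is that Maximizer compiles just the same whatever you plug in.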
Powerful optimizers aren't science fiction. Life is one, for example.
They should have learned the fundamentals of ML before promoting thought experiments. No, it's not that people are silly and get stuck on the paperclips bit; it's that you uncritically buy the assumptions that a meaningful general-purpose optimizer 1) is a natural design for AGI, 2) may pursue goals not well-aligned with human intent, and 3) is hard to steer.
Uhm I'd say those assumptions are stupidly obvious. 1) is pretty much tautological - intelligence is optimization. 2) is obvious, since human intent is given by high-complexity, very specific values we share but can barely even define, that we usually omit in communication, and which you can't just arrive at at random. 3) well, why would you assume an intelligence at a level equal to or above ours will be easy to steer?
"Fundamentals of ML" were known by these people. They also don't apply to this topic in any useful fashion.
> Uhm I'd say those assumptions are stupidly obvious.
Right, so I'm stupid if I don't see how they are correct. Or perhaps you've never inspected them.
> 1) is pretty much tautological - intelligence is optimization
This is what I call the philosophical immaturity of the LW crowd. How is intelligence optimization? Why not prediction, or compression[1], or interpolation? In what way is this obviously metaphorical claim useful information? If it refers to intelligence as an entity, it's technically vacuous; if it refers to a trait, it's a category error.
Rationalists easily commit sophomoric errors like reification; you do not distinguish the ontological properties of a thing from the way you define it. You do not even apply the definition. You define intelligence as optimization, but in reality you categorize things as intelligent based on a bag of informal heuristics that have nothing to do with that definition. Then you get spooked when intelligence-as-intuitively-detected improves, because you invoke your preconceived judgement about intelligence-as-defined.
And if you do use the definition, you have to really shoehorn it. In what sense is Terry Tao or, say, Eliezer Yudkowsky more of an optimizer than a standard-issue pump-and-dump crypto bro? He "optimizes math understanding", I suppose. In what sense is GPT-4, obviously more intelligent than a regular trading bot, more of an optimizing process? Because it optimizes perplexity, maybe? Does it optimize perplexity better than an HFT bot optimizes daily earnings? This is silly stuff, not stupidly obvious but obviously stupid - if only you stopped and tried to think through your own words instead of regurgitating the guru's turgid wisdom.
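(To be concrete about what "optimizes perplexity" even means - my own gloss: perplexity is just the exponentiated average negative log-likelihood of the next token under the model,

    \mathrm{PPL} = \exp\!\left( -\frac{1}{N} \sum_{i=1}^{N} \log p_\theta(x_i \mid x_{<i}) \right)

a quantity minimized by gradient descent at training time.)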
> human intent is given by high-complexity, very specific values we share but can barely even define, that we usually omit in communication, and which you can't just arrive at at random
Okay. What does any of this have to do with risks from real-world AI, specifically LLMs (which the AI doom crowd proposes to stop improving past GPT-4)? Every part of the world/behavior model learned by a general-purpose AI is high-complexity and very specific. There is no ontological distinction between learning about human preferences and learning about physics, no separate, differently structured value module and capability module, and there is no random search through value systems. To the extent that our AI systems can do anything useful at all, it's precisely because they learn high-complexity, non-formalized relationships. Why do you suppose they don't learn alien grammar but will learn alien morals from human data? Because something something The Vast Space of Optimization Processes?
You cannot sagely share the obsolete speculations of a microcelebrity science fiction writer and expect them to fly in 2023, when AI research is fairly advanced.
> well, why would you assume an intelligence at a level equal to or above ours will be easy to steer?
Because the better AIs get, the more steerable they are by inputs (within the constraints imposed on them by training regimens), and this is evident to anyone who's played with prompt engineering of early LLMs or diffusion models and then got access to cutting-edge stuff that just understands; because, once again, there is no meaningful distinction between capability and alignment when the measure of capability is following user intent, and that property is enabled by higher-fidelity knowledge and a greater capacity for information processing.
This easily extrapolates to qualitative superintelligence. Especially since we can tell that LLMs are essentially linear calculators for natural language, amenable to direct activation editing [2]. This won't change no matter how smart they get, so long as they keep the same fundamental architecture; and even if we depart from LLMs, learning a human POV will still be the easiest way to ground a model in the mode of operation that lets it work productively in the human world.
Anyway, what does it mean to have intelligence above ours, and why would it be harder to steer? And seeing as you define intelligence as optimization, in what sense is even middling intelligence steerable? Because we optimize it away from its own trajectory, or something? Because we're "stronger"? Underneath all this "stupidly obvious" verbiage is a screaming void of vague emotion-driven assumption.
But what drives me nuts is the extreme smug confidence of this movement, really captured in your posts. It's basically a bunch of vaguely smart and not very knowledgeable people who became addicted to the idea that they know Secrets Of Thinking Correctly. And now they think the world of their epiphanies.
> "Fundamentals of ML" were known by these people. They also don't apply to this topic in any useful fashion.
Yes, I know they believe that (both that they're knowledgeable and that this knowledge is irrelevant in light of the Big Picture). But is any of that correct? For example, Yud knows a bit about evolution and constantly applies the analogy of evolution to SGD to bolster his case for AI risk (humans are not aligned with the objective of inclusive genetic fitness => drastic misalignment in ML is probable). This is done by all the MIRI people and all the Twitter AI doom folks, the whole cult; it's presented as an important and informative intuition pump. But as far as I can tell, it's essentially ignorant both of evolution and of machine learning; either that, or just knowingly dishonest.[3] And there are many such moments: mesa-optimization, the overconfident drivel about optimization in general, Lovecraftian references and assumptions of randomness…
At some point one has to realise that, without all those gimmicks, there's no there there – the case for AI risk turns out to be not very strong. The AI Doom cult has exceptionally poor epistemic hygiene. Okay by the standards of a fan club, but utterly unacceptable for a serious research program.
My focus wasn't on the fact that it was making paperclips; my focus was on the optimizer part.
Totally unchecked exponential growth of that form simply does not exist. All things have limits that check their growth, and the assumption that a computer will grow like mad is exactly that - an assumption - one formed from extreme ignorance of what general AI will look like.
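(For what it's worth, the textbook formalization of "growth with a check" is logistic rather than exponential - the symbols here are the standard ones, nothing specific to AI:

    \frac{dN}{dt} = r N \left( 1 - \frac{N}{K} \right)

which looks exponential early on and flattens out as N approaches the carrying capacity K.)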
As a counterpoint, there seem to be some major assumptions on your part about what evolving systems cannot do.
In Earth's history we've had any number of these "uh oh" events, the Great Oxygenation Event being one of the longest and largest. Tiny oxygen-producing bacteria would quickly grow, and the check that stopped their growth was a massive wave of free-radical death that killed nearly every living cell in the ocean at the time. Then the system would build back and do it again, and again.
Ignoring black swan events, especially ones of your own making, is not a great way to continue existing. If there is even a low chance of something causing an extinction-level event, ensuring that you do not trigger it is paramount. Humans are already failing this test with CO2, going "Oh, it's just measured in parts per million," not realizing that it doesn't take much to affect the biosphere in ways unfriendly to continued human existence.