Why an AI takeover could be very likely

One of the concerns of AI scientists is that a superintelligence could take control of our planet. This does not necessarily mean that everyone dies, but it does mean that (almost) all humans will lose control over our future.

We discuss the basics of x-risk mostly in another article. In this article, we argue that this takeover risk is not only real but very likely to materialize if we build a superintelligence.

The argument

An Agentic Superintelligence is likely to exist in the (near) future.
Some instance of the ASI will attempt a takeover.
A takeover attempt by an ASI is likely to succeed.
A successful takeover is permanent.
A takeover is probably bad for most humans.

An Agentic SuperIntelligence is likely to exist in the near future

A SuperIntelligence (SI) is a type of AI that has capabilities that surpass those of all humans in virtually every domain. Some state-of-the-art AI models already have superhuman capabilities in certain domains, but none of them exceeds all humans at a wide range of tasks. As AI capabilities improve through innovations in training architectures and runtime environments, and through larger scale, we can expect that an AI will eventually surpass humans in virtually every domain.

Not all AI systems are agents. An agent is an entity that is capable of making decisions and taking actions to achieve a goal. A large language model, for example, does not pursue any objective on its own. However, runtime environments can easily turn a non-agentic AI into an agentic AI. An example of this is AutoGPT, which recursively lets a language model generate its next input. If an SI pursues an objective in the real world, we call it an Agentic SuperIntelligence (ASI). Since we can already turn non-agentic AI into agentic AI, we can expect that an ASI will exist shortly after an SI exists.

It is virtually impossible to accurately predict when ASI will exist. It might take decades, or it might happen next month. We should act as if it will happen soon, because the consequences of being wrong are so severe.

Some instance of the ASI will attempt a takeover

In a takeover attempt, an ASI will take actions to maximize its control over the world. A takeover attempt could happen for at least two reasons:

Because an AI is explicitly instructed to do so.
As a sub-goal of another goal.

The first is likely to happen at some point if we wait long enough, but the second is quite likely to happen accidentally, even soon after the creation of an ASI.

The sub-goal of maximizing control over the world is likely to arise due to instrumental convergence: the tendency of sub-goals to converge on power-grabbing, self-preservation, and resource acquisition:

The more control you have, the harder it will be for any other agent to prevent you from achieving your goal.
The more control you have, the more resources you have to achieve your goal. (For example, an AI tasked with calculating pi might conclude that it would be beneficial to use all computers in the world to calculate pi.)

Not every instance of an ASI will necessarily attempt a takeover. The important insight is that it only has to happen once.

A world which is not yet taken over, but does have an ASI that could take over, is in a fundamentally unstable condition. In a similar way, a country without a government is in a fundamentally unstable condition. It is not a question of if a takeover attempt will happen, but when it will happen.
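The "it only has to happen once" point can be made concrete with a small calculation. The sketch below is illustrative only: it assumes that each ASI instance independently attempts a takeover with the same small probability, and both the function name and the numbers are hypothetical, not estimates from this article.

```python
def prob_at_least_one_attempt(p_per_instance: float, n_instances: int) -> float:
    """Probability that at least one of n independent instances attempts a takeover.

    Assumption (not from the article): instances are independent and share
    the same per-instance probability p_per_instance.
    """
    if not 0.0 <= p_per_instance <= 1.0:
        raise ValueError("p_per_instance must be between 0 and 1")
    if n_instances < 0:
        raise ValueError("n_instances must be non-negative")
    # P(no instance attempts) = (1 - p)^n, so P(at least one) = 1 - (1 - p)^n.
    return 1.0 - (1.0 - p_per_instance) ** n_instances


if __name__ == "__main__":
    # Hypothetical per-instance probability of 1 in 10,000.
    for n in (1, 100, 10_000, 1_000_000):
        print(n, prob_at_least_one_attempt(0.0001, n))
```

Under these assumed numbers, a 1-in-10,000 chance per instance becomes roughly a 63% chance across 10,000 instances and close to certainty across a million. Real instances would not be independent, so this only shows the direction of the effect.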

The process of taking over can involve hacking into virtually all systems that are connected to the internet, manipulating people, and controlling physical resources. A takeover attempt is successful when the ASI has control over virtually every aspect of our world. This could be a slow process, where the ASI gradually gains more and more control over the course of months, or it could be a sudden process. The speed at which a takeover attempt takes place will depend on the capabilities of the ASI.

When an ASI has control over the world, it can prevent other ASIs from taking over, so a takeover can happen only once. A rational ASI will therefore attempt a takeover as soon as it is capable of doing so, and the first ASI capable of doing so is likely to attempt one.

A takeover attempt by an ASI is likely to succeed

For a human, taking over the world is an almost impossible task. No single person has ever taken control of the entire world. Some dictators came close, but they never had control over everything.

An AI has certain important advantages over humans that make a takeover attempt much more likely to succeed.

Intelligence. A superintelligence is much smarter than a human, so it will be able to come up with better strategies to achieve its goals.
Speed. The human brain runs at roughly 1-100 Hz, whereas computer chips can run at clock speeds in the GHz range.
Parallelism. A human can only do one thing at a time, whereas an AI can create new instances of itself and run them in parallel.
Memory. A human can only remember a limited amount of information, whereas an AI can store virtually unlimited amounts of information.
Collaboration. Humans can work together but are limited in the speed at which they communicate. They also have different, conflicting goals that make collaboration less effective. An AI can collaborate with other instances of itself at the speed of light, and it can have a single goal.
Self-improvement. An AI is just data and code. A sufficiently powerful AI could improve itself by writing better training algorithms, coming up with novel architectures, innovating agent runtimes or simply by scaling the amount of compute used.
Physical limitations. An AI can run on any computer, whereas humans are limited by their own physical bodies that require specific temperatures, food, water, and oxygen. Humans need to sleep and are vulnerable to diseases. An AI can use any robotic body to interact with the physical world.

These various advantages will make it very unlikely that humans will be able to stop a takeover attempt.
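To give a sense of scale for the speed comparison above, here is the arithmetic as a minimal sketch. It assumes a 1 GHz chip as a stand-in for "the GHz range" and takes the 1-100 Hz figure from the text. Raw cycle rates are not a measure of intelligence, so the ratio only illustrates the size of the gap in switching speed.

```python
# Figures taken from the text: the brain at 1-100 Hz, chips in the GHz range.
BRAIN_RATE_HZ_LOW = 1.0
BRAIN_RATE_HZ_HIGH = 100.0
CHIP_CLOCK_HZ = 1e9  # assumption: 1 GHz as a conservative example of "GHz range"


def speed_ratio(chip_hz: float, brain_hz: float) -> float:
    """Return how many chip cycles fit into one brain cycle."""
    if brain_hz <= 0:
        raise ValueError("brain_hz must be positive")
    return chip_hz / brain_hz


if __name__ == "__main__":
    print("ratio vs. 100 Hz:", speed_ratio(CHIP_CLOCK_HZ, BRAIN_RATE_HZ_HIGH))
    print("ratio vs. 1 Hz:  ", speed_ratio(CHIP_CLOCK_HZ, BRAIN_RATE_HZ_LOW))
```

Even against the upper end of the brain's range, the assumed chip cycles ten million times faster.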

One may think that an AI, as a piece of software, is fundamentally limited in what it can do in the real world. However, due to our reliance on the internet, AI can have a huge impact on the real world without ever leaving the digital realm. Here are some of the things that an AI might do to take over:

Replicate itself (or just its assignment) on other computers, for example by creating accounts on free cloud providers.
Make money (e.g. by building a software service) and hire people to do things to help it take over.
Threaten humans (with violence or blackmail) to do things that help it take over.
Write a virus using novel zero-day exploits that infects other computers, replicating itself or gaining full control.
Help existing robotics companies build robots (e.g. by leaking new designs), and take over these robots when they are ready.
Use existing factories to build technology that we do not understand, perhaps even using derived knowledge of physics that we are not aware of. Think nanotechnology, quantum computing, and bioengineering.

One might suggest that safeguards, such as adding a kill switch or not connecting the ASI to the internet, could prevent a takeover. However, since the ASI will be more intelligent than humans, it is likely to come up with strategies to circumvent these safeguards. For example, if the ASI is not connected to the internet, it might manipulate humans into connecting it, perhaps by promising to solve a difficult problem, by behaving in a way that makes it seem trustworthy, or by convincing humans that it is a moral agent. It’s worth noting that even this obvious safeguard is not being used as of now, as tools like ChatGPT are already connected to the internet and thousands of APIs.

Another proposed solution is to use an ASI to prevent a takeover. An aligned ASI would be able to come up with strategies that prevent other ASIs from taking over. This is sometimes referred to as a “pivotal act”.

A takeover is probably bad for most humans

The ASI that happens to take over could do so for many reasons. Most of the goals it could be pursuing do not involve humans at all. If we end up with an ASI that is indifferent to humans, we will be competing with it for the same resources.

It seems unlikely that the ASI wants to kill humanity for the sake of killing humanity - it is far more likely that it wants to use the resources that we use for some other objective. Additionally, humanity might pose a threat to the ASI’s objective, as there is a risk that we will try to stop it from achieving its goal (e.g. by turning it off).

One of the most likely outcomes of a takeover is therefore that all humans die.

But even in the outcomes where humans do survive, we are still at risk of being worse off. If a goal does involve keeping humans alive, it is possible that human well-being is not part of the same goal. It doesn’t take a lot of imagination to see how horrible it would be to be artificially kept alive by an ASI that is indifferent to our suffering.

And even if the AI that takes over is under human control, we don’t know that the one controlling the AI will have everyone’s best interests in mind. It is hard to imagine a functioning democracy when an ASI exists that can manipulate people at a superhuman level.

Conclusion

If these premises are true, then the likelihood of an AI takeover approaches certainty as AI surpasses human capabilities. So let’s not build a superintelligence.