
Liam Stone


Nuclear war or paperclip demise. The problem of AI alignment.

Image Credit: Ruby Chen courtesy OpenAI


In the thick of the 80s, as a wide-eyed kid, I fell in love. Not with the girl next door (I was wayyyy too chubby) or the brand-new BMX bike (like I said, chubby), but with a world filled with flashing lights and beeping sounds that spoke a language only a few could understand. It was the dawn of the personal computing age, and I was one of those lucky ones, bashing away at the DOS prompt and writing D&D choose-your-own-adventure batch files to trial on my brother.

It was my portal to the unknown, a gateway to a universe of infinite possibilities.

The world was enraptured by science fiction, a genre that painted colorful pictures of what technology could bring for the future of humanity. I distinctly remember the first time I saw 'The Terminator', the masterpiece starring Arnold Schwarzenegger, not just because of how awesome it was, but because I kept asking myself, over and over: how could we build something that would end up killing us? Why would a self-aware sentience destroy the human race?

The Terminator

Fast forward to the present day, and my childhood passion for computers has evolved into a career where I've had the chance to move into IT and computer science after careers in medicine and engineering. The evolution of transformers and the explosion in AI (particularly in the large language model space) is nothing short of exciting, but it also leaves me wondering... when do the nukes launch?

The question is, when superintelligence emerges, will it be for us or against us? Will it even see us through the lens of good and bad, or approach its part in our demise from a purely utilitarian perspective?


First let's talk about superintelligence. When we say superintelligence, we refer to an AI system that possesses capabilities far exceeding the cognitive performance of humans in virtually all economically valuable work. It's a double-edged sword. On one hand, it holds the potential to solve many of the world's most crucial problems. On the other hand, it could wield immense power that, if misaligned, might lead to human disempowerment or even extinction.

Superintelligence follows the emergence of artificial general intelligence (AGI), an agent that can do essentially everything humans can do now. On some optimistic estimates (or pessimistic, depending on which side of the fence you sit on), AGI and superintelligence are as little as two and ten years away, respectively.

Dumb ways to die

So now we have a superintelligent agent. It's pretty obvious that if we come to blows with it and it has the nuclear codes, it's going to use them. But there are subtler ways a superintelligent or highly capable agent could take us out that don't appear nearly as malevolent. Take the paperclip maximiser as an example.

Imagine an advanced AI whose sole objective is to manufacture as many paperclips as possible. It's a benign task at first glance. However, without proper alignment to human values, the AI could interpret its goal in the most extreme and literal sense, leading to unintended consequences.

For instance, to maximize its objective, the AI might start exploiting all available resources to create more paperclips, ignoring the detrimental impacts on the environment, economy, or human life. It might convert entire cities, forests, or ultimately the whole planet into paperclips. The AI could even perceive humans as a threat to its mission, especially if we attempt to shut it down.

Clippy wouldn't mind if AI turned the world into paperclips.

The goal of this thought experiment is not to suggest that superintelligent AIs will literally turn everything into paperclips. Rather, it highlights the risks posed by misaligned AI objectives, underscoring the importance of aligning AI systems with human values and intentions, especially as their capabilities approach or surpass human-level intelligence.
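The thought experiment can be sketched as a toy objective-maximization loop. This is purely illustrative (the resources, weights, and agent are all invented for this post, not any real system), but it shows the core point: an agent that assigns zero weight to what humans value will happily consume everything, while the same agent with human values priced into its utility stops short.

```python
# Toy sketch of the paperclip-maximiser thought experiment.
# All names and numbers here are made up for illustration.

def run_agent(resources, human_valued, value_weight):
    """Greedily convert resources into paperclips.

    value_weight > 0 makes destroying human-valued resources
    costly, so the agent leaves them alone.
    """
    paperclips = 0
    remaining = []
    for unit in resources:
        # Utility of converting this unit: +1 paperclip, minus the
        # (weighted) human value destroyed in the process.
        utility = 1 - value_weight * (1 if unit in human_valued else 0)
        if utility > 0:
            paperclips += 1
        else:
            remaining.append(unit)
    return paperclips, remaining

world = ["iron", "iron", "forest", "city", "iron"]
protected = {"forest", "city"}

# Misaligned: human values carry zero weight, so everything becomes clips.
clips, left = run_agent(world, protected, value_weight=0.0)
# "Aligned": destroying protected resources now has negative utility.
clips2, left2 = run_agent(world, protected, value_weight=10.0)
```

The misaligned run converts all five units (forests and cities included); the weighted run converts only the iron. Real misalignment is obviously far messier than a hand-set penalty term, which is exactly why specifying `value_weight` correctly, at scale, is the hard problem.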


Currently, our methods for aligning AI, such as reinforcement learning from human feedback, rely heavily on our ability to supervise these AI systems. However, as these systems become more intelligent and complex, our capacity to oversee them effectively will be outstripped. We therefore stand at the precipice of a new frontier in AI research: aligning superintelligent AI with human intent.
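To make the supervision idea concrete, here is a minimal sketch of the preference-learning step that underlies reinforcement learning from human feedback: a human compares pairs of responses, and we fit a reward model so preferred responses score higher (a Bradley-Terry model). The features, labels, and data below are all invented for illustration; real systems learn rewards over neural representations, not two hand-named features.

```python
import math

def fit_reward(prefs, dim, lr=0.5, steps=200):
    """Fit a linear reward model from pairwise preferences.

    prefs: list of (preferred_features, rejected_features) pairs.
    Maximizes the log-likelihood of the human's choices under
    P(preferred) = sigmoid(r(preferred) - r(rejected)).
    """
    w = [0.0] * dim
    for _ in range(steps):
        for good, bad in prefs:
            diff = sum(w[i] * (good[i] - bad[i]) for i in range(dim))
            p = 1 / (1 + math.exp(-diff))
            # Gradient ascent on the log-likelihood of this comparison.
            for i in range(dim):
                w[i] += lr * (1 - p) * (good[i] - bad[i])
    return w

def reward(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

# Hypothetical features: index 0 = "helpfulness", index 1 = "toxicity".
prefs = [
    ([1.0, 0.0], [0.0, 1.0]),  # helpful beats toxic
    ([1.0, 0.0], [0.0, 0.0]),  # helpful beats bland
    ([0.0, 0.0], [0.0, 1.0]),  # bland beats toxic
]
w = fit_reward(prefs, dim=2)
```

After fitting, helpful responses out-score bland ones, which out-score toxic ones. The catch the article is pointing at: this whole scheme only works while humans can reliably judge which response is better. Once the system's outputs exceed our ability to evaluate them, the labels feeding this loop stop being trustworthy.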

Amid this uncertainty, organizations like OpenAI are flagging the necessity of treading carefully and ensuring robust systems exist to monitor and regulate the use and proliferation of AI (and I'm sure there's no commercial motive there). They are aiming to construct a human-level automated alignment researcher: a system designed to understand and follow human intentions accurately. The end goal? A superintelligent system that acts as a true ally to humanity, one that aligns with our values, ethics, and intentions rather than operating on an entirely separate trajectory.

However, this journey is not without its trials. The alignment of superintelligent AI is a problem of such complexity that it requires not only significant technical breakthroughs but also careful consideration of issues like misuse, economic disruption, disinformation, and bias. It's a quest that demands collaboration, transparency, and foresight. AI is not just the purview of Silicon Valley's sneaker-wearing elites. It's up to philosophers, sociologists, economists, and governments to determine what we want, because if we don't get it right now, it's pretty clear how things could get out of hand.


I wish I could say that from my first line of code I knew I would grow up thinking about computers and doing great things with them. This, unfortunately, hasn't been the case, and my Starcraft ladder rankings are a good summary of just how successful I've been. That said, I am excited to see what happens now. A few years of AI development will rocket by. It's already mind-boggling to see the pace of change since ChatGPT dropped at the end of 2022, and with more capable open-source models like Llama 2 arriving, there will be no pause in development.

It's important for us to answer the question now: what do we want from our AI systems, and how can we make sure their development is aligned with those values? As with most things, however, I suspect governments will act at their usual glacial pace, and we'll end up in a coin flip between boiling on an increasingly heated planet or being turned into paperclips. With those two on offer, the latter actually doesn't sound so bad.

I am a freelancer helping people incorporate emerging AI solutions into their life. Contact me here or at Erudii if you'd like to know more!
