36% of respondents agree that it is plausible that AI could produce catastrophic outcomes in this century, on the level of all-out nuclear war.
That’s 36% of a rather special group of people. People who replied to this survey needed to meet the criterion of being a named author on at least two papers published in the last three years in accredited journals in the field of Computational Linguistics (CL) – the field sometimes also known as NLP (Natural Language Processing).
The survey took place in May and June 2022. 327 complete responses were received from people matching the criteria.
A full report on this survey (31 pages) is available here (PDF).
Here’s a screenshot from page 10 of the report, illustrating the answers to questions about Artificial General Intelligence (AGI):
You can see the responses to question 3-4. 36% of the respondents either “agreed” or “weakly agreed” with the statement that
It is plausible that decisions made by AI or machine learning systems could cause a catastrophe this century that is at least as bad as an all-out nuclear war.
That statistic is a useful backdrop to discussions stirred up in the last few days by a video interview given by polymath autodidact and long-time AGI risk researcher Eliezer Yudkowsky:
The publishers of that video chose the eye-catching title “we’re all gonna die”.
If you don't want to spend 90 minutes watching that video – or if you are personally alienated by Eliezer's communication style – here's a useful Twitter thread summary by Liron Shapira:
In contrast to the question posed in the NLP survey I mentioned earlier, Eliezer isn’t thinking about “outcomes of AGI in this century“. His timescales are much shorter. His “ballpark estimate” for the time before AGI arrives is “3-15 years”.
How are people reacting to this sombre prediction?
More generally, what responses are there to the statistic that, as quoted above,
36% of respondents agree that it is plausible that AI could produce catastrophic outcomes in this century, on the level of all-out nuclear war.
I’ve seen a lot of different reactions. They break down into four groups: denial, sabotage, trust, and hustle.
1. Denial
One example of denial is this claim: We're nowhere near understanding the magic of human minds. Therefore there's no chance that engineers are going to duplicate that magic in artificial systems.
I have two counters:
- The risks of AGI arise not because the AI may somehow become sentient and take on the unpleasant aspects of alpha male human nature. Rather, the risks arise from systems that operate beyond our understanding and outside our control, and which may end up pursuing objectives different from the ones we thought (or wished) we had programmed into them
- Many systems have been created over the decades without the underlying science being fully understood. Steam engines predated the laws of thermodynamics. More recently, LLMs (Large Language Model AIs) have demonstrated aspects of intelligence that the designers of these systems had not anticipated. In the same way, AIs with some extra features may unexpectedly tip over into greater general intelligence.
Another example of denial: Some very smart people say they don’t believe that AGI poses risks. Therefore we don’t need to pay any more attention to this stupid idea.
My counters:
- The mere fact that someone very smart asserts an idea – likely outside of their own field of special expertise – does not confirm the idea is correct
- None of these purported objections to the possibility of AGI risk holds water (for a longer discussion, see my book The Singularity Principles).
Digging further into various online discussion threads, I formed the impression that what was motivating some of the denial was often a terrible fear. The people loudly proclaiming their denial were trying to cope with despair: the thought of potential human extinction within just 3-15 years was simply too dreadful for them to contemplate.
It's similar to how people sometimes cope with the death of someone dear to them. There's a chance my dear friend has now been reunited in an afterlife with their beloved grandparents, they whisper to themselves. Or, It's sweet and honourable to die for your country: this death was a glorious sacrifice. And then woe betide any uppity humanist who dares to suggest there is no afterlife, or that patriotism is the last refuge of a scoundrel!
Likewise, woe betide any uppity AI risk researcher who dares to suggest that AGI might not be so benign after all! Deny! Deny!! Deny!!!
(For more on this line of thinking, see my short chapter “The Denial of the Singularity” in The Singularity Principles.)
A different motivation for denial is the belief that any sufficient "cure" for the risk of AGI catastrophe would be worse than the risk it was trying to address. This line of thinking goes as follows:
- A solution to AGI risk will involve pervasive monitoring and widespread restrictions
- Such monitoring and restrictions will only be possible if an autocratic world government is put in place
- Any autocratic world government would be absolutely terrible
- Therefore, the risk of AGI can’t be that bad after all.
I’ll come back later to the flaws in that particular argument. (In the meantime, see if you can spot what’s wrong.)
2. Sabotage
In the video interview, Eliezer made one suggestion for avoiding AGI catastrophe: Destroy all the GPU server farms.
These vast collections of GPUs (a special kind of computing chip) are what enable the training of many types of AI. If these chips were all put out of action, it would delay the arrival of AGI, giving humanity more time to work out a better solution for coexisting with AGI.
Another suggestion Eliezer makes is that the superbright people who are currently working flat out to increase the capabilities of their AI systems should be paid large amounts of money to do nothing. They could lounge about on a beach all day, and still earn more money than they are currently receiving from OpenAI, DeepMind, or whoever is employing them. Once again, that would slow down the emergence of AGI, and buy humanity more time.
I’ve seen other similar suggestions online, which I won’t repeat here, since they come close to acts of terrorism.
What all these suggestions have in common is this: let's find ways to stop the development of AI in its tracks, all across the world. Companies should be stopped in their tracks. Shadowy military research groups should be stopped in their tracks. Open source hackers should be stopped in their tracks. North Korean ransomware hackers must be stopped in their tracks.
This isn't just a suggestion that specific AI developments should be halted, namely those with an explicit target of creating AGI. Instead, it recognises that the creation of AGI might occur via unexpected routes. Improving the performance of various narrow AI systems, such as fact-checking, emotion recognition, or online request interchange marketplaces – any of these improvements might push the collection of AI modules over the critical threshold. Mixing metaphors, AI could go nuclear.
Shutting down all these research activities seems a very tall order, especially since many of the people who are currently working flat out to increase AI capabilities are motivated not by money but by the vision that better AI could do a tremendous amount of good in the world: curing cancer, solving nuclear fusion, improving agriculture by leaps and bounds, and so on. They're not going to be easy to persuade to change course. For them, there's a lot more at stake than money.
I have more to say about the question "To AGI or not AGI" in this chapter. In short, I'm deeply sceptical that such a worldwide shutdown could be achieved.
In response, a would-be saboteur may admit that their chances of success are low. But, they will ask, what do you suggest instead?
Read on.
3. Trust
Let’s start again from the statistic that 36% of the NLP survey respondents agreed, with varying degrees of confidence, that advanced AI could trigger a catastrophe as bad as an all-out nuclear war some time this century.
It’s a pity that the question wasn’t asked with shorter timescales. Comparing the chances of an AI-induced global catastrophe in the next 15 years with one in the next 85 years:
- The longer timescale makes it more likely that AGI will be developed
- The shorter timescale makes it more likely that AGI safety research will still be at a primitive (deeply ineffective) level.
Even in the short time since the survey was conducted – May and June 2022 – many forecasters have shortened their estimates of the likely timeline until the arrival of AGI.
So, for the sake of the argument, let’s suppose that the risk of an AI-induced global catastrophe happening by 2038 (15 years from now) is 1/10.
There are two ways to react to this:
- 1/10 is fine odds. I feel lucky. What's more, there are plenty of reasons why we ought to feel lucky
- 1/10 is terrible odds. That’s far too high a risk to accept. We need to hustle to find ways to change these odds in our favour.
I’ll come to the hustle response in a moment. But let’s first consider the trust response.
A good example is in this comment from SingularityNET founder and CEO Ben Goertzel:
Eliezer is a very serious thinker on these matters and was the core source of most of the ideas in Nick Bostrom’s influential book Superintelligence. But ever since I met him, and first debated these issues with him, back in 2000 I have felt he had a somewhat narrow view of humanity and the universe in general.
There are currents of love and wisdom in our world that he is not considering and seems to be mostly unaware of, and that we can tap into by creating self reflective compassionate AGIs and doing good loving works together with them.
In short, rather than fearing humanity, we should learn to trust humanity. Rather than fearing what AGI will do, we should trust that AGI can do wonderful things.
You can find a much longer version of Ben’s views in the review he wrote back in 2015 of Superintelligence. It’s well worth reading.
What are the grounds for hope? Humanity has come through major challenges in the past. Even though the scale of the challenge is more daunting on this occasion, there are also more people contributing ideas and inspiration than before. AI is more accessible than nuclear weapons, which increases the danger level, but AI could also be deployed as part of the solution, rather than just being a threat.
Another idea is that if an AI looks around for data teaching it which values to respect and uphold, it will find plenty of positive examples in great human literature. OK, that literature also includes lots of treachery, and different moral codes often conflict, but a wise AGI should be able to see past all these complications to discern the importance of defending human flourishing. OK, much of AI training at the moment focuses on deception, manipulation, enticement, and surveillance, but, again, we can hope that a wise AGI will set aside those nastier aspects of human behaviour. Rather than aping trolls or clickbait, we can hope that AGI will echo the better angels of human nature.
It’s also possible that, just as DeepMind’s AlphaGo Zero worked out by itself, without any human input, superior strategies at the board games Go and Chess, a future AI might work out, by itself, the principles of universal morality. (That’s assuming such principles exist.)
We would still have to hope, in such a case, that the AI that worked out the principles of universal morality would decide to follow these principles, rather than having some alternative (alien) ways of thinking.
But surely hope is better than despair?
Despondence is unwarranted and unproductive. We need to focus on optimistically maximizing odds of a wildly beneficial Singularity together.
My view is the same as expressed by Berkeley professor of AI Stuart Russell, in part of a lengthy exchange with Steven Pinker on the subject of AGI risks:
The meta argument is that if we don’t talk about the failure modes, we won’t be able to address them…
Just like in nuclear safety, it’s not against the rules to raise possible failure modes like, what if this molten sodium that you’re proposing should flow around all these pipes? What if it ever came into contact with the water that’s on the turbine side of the system? Wouldn’t you have a massive explosion which could rip off the containment and so on? That’s not exactly what happened in Chernobyl, but not so dissimilar…
The idea that we could solve that problem without even mentioning it, without even talking about it and without even pointing out why it’s difficult and why it’s important, that’s not the culture of safety. That’s sort of more like the culture of the communist party committee in Chernobyl, that simply continued to assert that nothing bad was happening.
(By the way, my sympathies in that long discussion, when it comes to AGI risk, are approximately 100.0% with Russell and approximately 0.0% with Pinker.)
4. Hustle
The story so far:
- The risks are real (though estimates of their probability vary)
- Some possible “solutions” to the risks might produce results that are, by some calculations, worse than letting AGI take its own course
- If we want to improve our odds of survival – and, indeed, for humanity to reach something like a sustainable superabundance with the assistance of advanced AIs – we need to be able to take a clear, candid view of the risks facing us
- Being naïve about the dangers we face is unlikely to be the best way forward
- Since time may be short, the time to press for better answers is now
- We shouldn’t despair. We should hustle.
Some ways in which research could generate useful new insight relatively quickly:
- When the NLP survey respondents expressed their views, what reasons did they have for disagreeing with the statement? And what reasons did they have for agreeing with it? And how do these reasons stand up, in the cold light of a clear analysis? (In other words, rather than a one-time survey, an iterative Delphi survey should lead to deeper understanding.)
- Why have the various AI safety initiatives formed in the wake of the Puerto Rico and Asilomar conferences of 2015 and 2017 fallen so far short of expectations?
- Which descriptions of potential catastrophic AI failure modes are most likely to change the minds of those critics who currently like to shrug off failure scenarios as “unrealistic” or “Hollywood fantasy”?
More constructively, I invite conversation on the strengths and weaknesses of the 21 Singularity Principles that I have suggested as ways to improve the chances of beneficial AGI outcomes.
For example:
- Can we identify “middle ways” that include important elements of global monitoring and auditing of AI systems, without collapsing into autocratic global government?
- Can we improve the interpretability and explainability of advanced AI systems (perhaps with the help of trusted narrow AI tools), to diminish the risks of these systems unexpectedly behaving in ways their designers failed to anticipate?
- Can we deepen our understanding of the ways new capabilities “emerge” in advanced AI systems, with a particular focus on preventing the emergence of alternative goals?
I also believe we should explore more fully the possibility that an AGI will converge on a set of universal values, independent of whatever training we provide it – and, moreover, the possibility that these values will include upholding human flourishing.
And although I said just now that these values would be "independent of whatever training we provide", is there nevertheless a way for us to tilt the landscape so that the AGI is more likely to reach and respect these conclusions?
Postscript
To join me in “camp hustle”, visit Future Surge, which is the activist wing of London Futurists.
If you’re interested in the ideas of my book The Singularity Principles, here’s a podcast episode in which Calum Chace and I discuss some of these ideas more fully.
In a subsequent episode of our podcast, Calum and I took another look at the same topics, this time with Millennium Project Executive Director Jerome Glenn: “Governing the transition to AGI”.