36% of respondents agree that it is plausible that AI could produce catastrophic outcomes in this century, on the level of all-out nuclear war.
That’s 36% of a rather special group of people. People who replied to this survey needed to meet the criterion of being a named author on at least two papers published in the last three years in accredited journals in the field of Computational Linguistics (CL) – the field sometimes also known as NLP (Natural Language Processing).
The survey took place in May and June 2022. 327 complete responses were received from people matching these criteria.
A full report on this survey (31 pages) is available here (PDF).
Here’s a screenshot from page 10 of the report, illustrating the answers to questions about Artificial General Intelligence (AGI):
You can see the responses to question 3-4. 36% of the respondents either “agreed” or “weakly agreed” with the statement that
It is plausible that decisions made by AI or machine learning systems could cause a catastrophe this century that is at least as bad as an all-out nuclear war.
That statistic is a useful backdrop to discussions stirred up in the last few days by a video interview given by polymath autodidact and long-time AGI risk researcher Eliezer Yudkowsky:
The publishers of that video chose the eye-catching title “we’re all gonna die”.
If you don’t want to spend 90 minutes watching that video – or if you are personally alienated by Eliezer’s communication style – here’s a useful twitter thread summary by Liron Shapira:
In contrast to the question posed in the NLP survey I mentioned earlier, Eliezer isn’t thinking about “outcomes of AGI in this century“. His timescales are much shorter. His “ballpark estimate” for the time before AGI arrives is “3-15 years”.
How are people reacting to this sombre prediction?
More generally, what responses are there to the statistic that, as quoted above,
36% of respondents agree that it is plausible that AI could produce catastrophic outcomes in this century, on the level of all-out nuclear war.
I’ve seen a lot of different reactions. They break down into four groups: denial, sabotage, trust, and hustle.
1. Denial
One example of denial is this claim: We’re nowhere near understanding the magic of human minds. Therefore there’s no chance that engineers are going to duplicate that magic in artificial systems.
I have two counters:
- The risks of AGI arise not because the AI may somehow become sentient and take on the unpleasant aspects of alpha-male human nature, but because such systems may operate beyond our understanding and outside our control, and may end up pursuing objectives different from the ones we thought (or wished) we had programmed into them
- Many systems have been created over the decades without the underlying science being fully understood. Steam engines predated the laws of thermodynamics. More recently, LLMs (Large Language Model AIs) have demonstrated aspects of intelligence that the designers of these systems had not anticipated. In the same way, AIs with some extra features may unexpectedly tip over into greater general intelligence.
Another example of denial: Some very smart people say they don’t believe that AGI poses risks. Therefore we don’t need to pay any more attention to this stupid idea.
My counters:
- The mere fact that someone very smart asserts an idea – likely outside of their own field of special expertise – does not confirm the idea is correct
- None of these purported objections to the possibility of AGI risk holds water (for a longer discussion, see my book The Singularity Principles).
Digging further into various online discussion threads, I formed the impression that what was motivating some of the denial was often a terrible fear. The people loudly proclaiming their denial were trying to cope with depression. The thought of potential human extinction within just 3-15 years was simply too dreadful for them to contemplate.
It’s similar to how people sometimes cope with the death of someone dear to them. There’s a chance my dear friend has now been reunited in an afterlife with their beloved grandparents, they whisper to themselves. Or, It’s sweet and honourable to die for your country: this death was a glorious sacrifice. And then woe betide any uppity humanist who dares to suggest there is no afterlife, or that patriotism is the last refuge of a scoundrel!
Likewise, woe betide any uppity AI risk researcher who dares to suggest that AGI might not be so benign after all! Deny! Deny!! Deny!!!
(For more on this line of thinking, see my short chapter “The Denial of the Singularity” in The Singularity Principles.)
A different motivation for denial is the belief that any sufficient “cure” for the risk of AGI catastrophe would be worse than the risk it was trying to address. This line of thinking goes as follows:
- A solution to AGI risk will involve pervasive monitoring and widespread restrictions
- Such monitoring and restrictions will only be possible if an autocratic world government is put in place
- Any autocratic world government would be absolutely terrible
- Therefore, the risk of AGI can’t be that bad after all.
I’ll come back later to the flaws in that particular argument. (In the meantime, see if you can spot what’s wrong.)
2. Sabotage
In the video interview, Eliezer made one suggestion for avoiding AGI catastrophe: Destroy all the GPU server farms.
These vast collections of GPUs (a special kind of computing chip) are what enables the training of many types of AI. If these chips were all put out of action, it would delay the arrival of AGI, giving humanity more time to work out a better solution to coexisting with AGI.
Another suggestion Eliezer makes is that the superbright people who are currently working flat out to increase the capabilities of their AI systems should be paid large amounts of money to do nothing. They could lounge about on a beach all day, and still earn more money than they are currently receiving from OpenAI, DeepMind, or whoever is employing them. Once again, that would slow down the emergence of AGI, and buy humanity more time.
I’ve seen other similar suggestions online, which I won’t repeat here, since they come close to acts of terrorism.
What all these suggestions have in common is this: let’s find ways to stop the development of AI in its tracks, all across the world. Companies should be stopped in their tracks. Shadowy military research groups should be stopped in their tracks. Open source hackers should be stopped in their tracks. North Korean ransomware hackers must be stopped in their tracks.
This isn’t just a suggestion that specific AI developments should be halted, namely those with the explicit target of creating AGI. Instead, it recognises that the creation of AGI might occur via unexpected routes. Improving the performance of various narrow AI systems – fact-checking, emotion recognition, or online request interchange marketplaces – any of these might push the collection of AI modules over the critical threshold. Mixing metaphors, AI could go nuclear.
Shutting down all these research activities seems a very tall order. Especially since many of the people who are currently working flat out to increase AI capabilities are motivated, not by money, but by the vision that better AI could do a tremendous amount of good in the world: curing cancer, solving nuclear fusion, improving agriculture by leaps and bounds, and so on. They’re not going to be easy to persuade to change course. For them, there’s a lot more at stake than money.
I have more to say about the question “To AGI or not AGI” in this chapter. In short, I’m deeply sceptical that AI development can be stopped in its tracks worldwide.
In response, a would-be saboteur may admit that their chances of success are low. But what do you suggest instead, they will ask.
Read on.
3. Trust
Let’s start again from the statistic that 36% of the NLP survey respondents agreed, with varying degrees of confidence, that advanced AI could trigger a catastrophe as bad as an all-out nuclear war some time this century.
It’s a pity that the question wasn’t asked with shorter timescales. Comparing the chances of an AI-induced global catastrophe in the next 15 years with one in the next 85 years:
- The longer timescale makes it more likely that AGI will be developed
- The shorter timescale makes it more likely that AGI safety research will still be at a primitive (deeply ineffective) level.
Even since the date of the survey – May and June 2022 – many forecasters have shortened their estimates of the likely timeline to the arrival of AGI.
So, for the sake of the argument, let’s suppose that the risk of an AI-induced global catastrophe happening by 2038 (15 years from now) is 1/10.
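(A quick aside on the arithmetic, purely as a rough illustration – this is my own back-of-the-envelope sketch, not anything taken from the survey. If we assume, just for simplicity, a constant and independent annual hazard p, then the cumulative risk R over T years satisfies:

\[
R = 1 - (1 - p)^{T}
\quad\Rightarrow\quad
p = 1 - (1 - R)^{1/T} = 1 - 0.9^{1/15} \approx 0.007
\]

In other words, a 1/10 cumulative risk by 2038 corresponds to roughly a 0.7% chance of catastrophe in any given year; compounded over 85 years instead, that same annual hazard would give a cumulative risk of around 45%. That’s one way of seeing why the choice of timescale matters so much to how the survey statement reads.)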
There are two ways to react to this:
- 1/10 is fine odds. I feel lucky. What’s more, there are plenty of reasons why we ought to feel lucky
- 1/10 is terrible odds. That’s far too high a risk to accept. We need to hustle to find ways to change these odds in our favour.
I’ll come to the hustle response in a moment. But let’s first consider the trust response.
A good example is in this comment from SingularityNET founder and CEO Ben Goertzel:
Eliezer is a very serious thinker on these matters and was the core source of most of the ideas in Nick Bostrom’s influential book Superintelligence. But ever since I met him, and first debated these issues with him, back in 2000 I have felt he had a somewhat narrow view of humanity and the universe in general.
There are currents of love and wisdom in our world that he is not considering and seems to be mostly unaware of, and that we can tap into by creating self reflective compassionate AGIs and doing good loving works together with them.
In short, rather than fearing humanity, we should learn to trust humanity. Rather than fearing what AGI will do, we should trust that AGI can do wonderful things.
You can find a much longer version of Ben’s views in the review he wrote back in 2015 of Superintelligence. It’s well worth reading.
What are the grounds for hope? Humanity has come through major challenges in the past. Even though the scale of the challenge is more daunting on this occasion, there are also more people contributing ideas and inspiration than before. AI is more accessible than nuclear weapons, which increases the danger level, but AI could also be deployed as part of the solution, rather than just being a threat.
Another idea is that if an AI looks around for data teaching it which values to respect and uphold, it will find plenty of positive examples in great human literature. OK, that literature also includes lots of treachery, and different moral codes often conflict, but a wise AGI should be able to see past these complications and discern the importance of defending human flourishing. OK, much of AI training at the moment focuses on deception, manipulation, enticement, and surveillance, but, again, we can hope that a wise AGI will set aside those nastier aspects of human behaviour. Rather than aping trolls or clickbait, we can hope that AGI will echo the better angels of human nature.
It’s also possible that, just as DeepMind’s AlphaGo Zero worked out by itself, without any human input, superior strategies at the board games Go and Chess, a future AI might work out, by itself, the principles of universal morality. (That’s assuming such principles exist.)
We would still have to hope, in such a case, that the AI that worked out the principles of universal morality would decide to follow these principles, rather than having some alternative (alien) ways of thinking.
But surely hope is better than despair?
Despondence is unwarranted and unproductive. We need to focus on optimistically maximizing odds of a wildly beneficial Singularity together.
My view is the same as expressed by Berkeley professor of AI Stuart Russell, in part of a lengthy exchange with Steven Pinker on the subject of AGI risks:
The meta argument is that if we don’t talk about the failure modes, we won’t be able to address them…
Just like in nuclear safety, it’s not against the rules to raise possible failure modes like, what if this molten sodium that you’re proposing should flow around all these pipes? What if it ever came into contact with the water that’s on the turbine side of the system? Wouldn’t you have a massive explosion which could rip off the containment and so on? That’s not exactly what happened in Chernobyl, but not so dissimilar…
The idea that we could solve that problem without even mentioning it, without even talking about it and without even pointing out why it’s difficult and why it’s important, that’s not the culture of safety. That’s sort of more like the culture of the communist party committee in Chernobyl, that simply continued to assert that nothing bad was happening.
(By the way, my sympathies in that long discussion, when it comes to AGI risk, are approximately 100.0% with Russell and approximately 0.0% with Pinker.)
4. Hustle
The story so far:
- The risks are real (though estimates of their probability vary)
- Some possible “solutions” to the risks might produce results that are, by some calculations, worse than letting AGI take its own course
- If we want to improve our odds of survival – and, indeed, for humanity to reach something like a sustainable superabundance with the assistance of advanced AIs – we need to be able to take a clear, candid view of the risks facing us
- Being naïve about the dangers we face is unlikely to be the best way forward
- Since time may be short, the time to press for better answers is now
- We shouldn’t despair. We should hustle.
Some ways in which research could generate useful new insight relatively quickly:
- When the NLP survey respondents expressed their views, what reasons did they have for disagreeing with the statement? And what reasons did they have for agreeing with it? And how do these reasons stand up, in the cold light of a clear analysis? (In other words, rather than a one-time survey, an iterative Delphi survey should lead to deeper understanding.)
- Why have the various AI safety initiatives formed in the wake of the Puerto Rico and Asilomar conferences of 2015 and 2017 fallen so far short of expectations?
- Which descriptions of potential catastrophic AI failure modes are most likely to change the minds of those critics who currently like to shrug off failure scenarios as “unrealistic” or “Hollywood fantasy”?
Constructively, I invite conversation on the strengths and weaknesses of the 21 Singularity Principles that I have suggested as ways of improving the chances of beneficial AGI outcomes.
For example:
- Can we identify “middle ways” that include important elements of global monitoring and auditing of AI systems, without collapsing into autocratic global government?
- Can we improve the interpretability and explainability of advanced AI systems (perhaps with the help of trusted narrow AI tools), to diminish the risks of these systems unexpectedly behaving in ways their designers failed to anticipate?
- Can we deepen our understanding of the ways new capabilities “emerge” in advanced AI systems, with a particular focus on preventing the emergence of alternative goals?
I also believe we should explore more fully the possibility that an AGI will converge on a set of universal values, independent of whatever training we provide it – and, moreover, the possibility that these values will include upholding human flourishing.
And despite me saying just now that these values would be “independent of whatever training we provide”, is there, nevertheless, a way for us to tilt the landscape so that the AGI is more likely to reach and respect these conclusions?
Postscript
To join me in “camp hustle”, visit Future Surge, which is the activist wing of London Futurists.
If you’re interested in the ideas of my book The Singularity Principles, here’s a podcast episode in which Calum Chace and I discuss some of these ideas more fully.
In a subsequent episode of our podcast, Calum and I took another look at the same topics, this time with Millennium Project Executive Director Jerome Glenn: “Governing the transition to AGI”.
Combining AIs (via APIs / web services / some new framework) will most likely produce something akin to AGI v0.1
It will be akin to the spark of life; ‘was it during conception’ (of the desire to link AIs), or, ‘was it some debatable point in time during pregnancy / gestation…’?
Who knows, but such combinations will undoubtedly be more than the sum of their parts in ways we have not yet conceived, in the same way I doubt ‘JANET’ and other such networks were ever proposed to be the backbone of the infosphere we currently rely upon for education, trade, social commentary AND waging war.
Attempting to create an AGI (like all coding projects) requires a statement of needs / functionality, which is why I doubt it will be created in that way.
AGI v2.0??? Probably lots of iterations of the above. In the same way that we used to code our own sub-routines but now use libraries written by others, AGI v2.0 will most likely be formed from many ‘sub-routines’ that people haven’t conceived yet.
There are many genies already, and there will be many, many more – too many for containment by human bottles and their silly analog corks.
(Second submission of this comment – with capitalisation – what happened to the first???)
Comment by jamesb692 — 23 February 2023 @ 5:24 pm
Hi James,
I’m not sure how tightly constrained each AI development project will be to a pre-written statement of needs/functionality.
The various new LLMs have often contained functionality that was not part of any such statement, since these capabilities emerged without the designers anticipating them.
Moreover, projects often have a way of developing their own momentum. That’s something I recommend developers should guard against, in the first set of the Singularity Principles: https://transpolitica.org/projects/the-singularity-principles/in-depth/goals-and-outcomes/
Re “what happened to the first [submission of your comment]” – I only saw this second one. There’s no sign of the earlier one on the system anywhere, sorry!
Comment by David Wood — 23 February 2023 @ 6:01 pm
“A Software Engineer may not injure Humanity or, through inaction, allow Humanity to come to harm.”
On CS courses, new software engineers need to be taught this First Law, taught that software could be existentially dangerous to humanity, taught that respecting this Law is infinitely more important (well obviously) than respecting any directors’ instructions to maximise their shareholders’ short-term profits.
(And be given examples.)
Your single most important function as you enter the modern software world is to help destroy any software projects that are dangerous to Humanity. It’s the equivalent of defusing a nuclear escalation. They will build statues of you. Do Not “Follow Orders” and be part of destroying Humanity. Ever since we put the Nazis on trial, “following orders” is no defence.
Comment by Nikos Helios — 24 February 2023 @ 9:02 am
Hi Nikos,
I have a lot of sympathy for what you propose. That’s why, for example, two of the 21 Singularity Principles that I advocate are “Insist on accountability” and “Penalise disinformation”, https://transpolitica.org/projects/the-singularity-principles/in-depth/responsible-development/
However, there are some well-known issues with proposals based on Asimov’s Laws of Robotics. People can differ on their views of “what counts as harm”. And the “through inaction” clause means that sometimes the software engineer should hurry up with the development of a product, since the absence of that product is causing harm in the world. See the discussion in https://transpolitica.org/projects/the-singularity-principles/risks-and-benefits/the-alignment-problem/
Comment by David Wood — 24 February 2023 @ 1:10 pm
Then let me rephrase, widening the scope:
Anyone who finds themselves involved with an activity or project which they think has a not insignificant chance of being existentially dangerous to Humanity should not argue the toss, but should do whatever they can to sabotage it.
*Because it doesn’t matter if they’re wrong.* Imagine that ten different people sabotage ten different projects, and we only find out twenty years later that actually only one of the projects would have been an existential threat to humanity. It doesn’t matter that nine projects were sabotaged which in hindsight could have lived. What matters is that Humanity Still Exists.
Or do you disagree with that? (So, if you find yourself on a project which you think may lead to an existential threat to Humanity you should… campaign for better safeguards? Thinks about shareholder value? FFS! You should try to kill that project, duh!)
Because Business As Usual – well-meaning people trying to suggest improvements and safeguards while Capitalist (and State-Capitalist) companies fight to be First Come Hell Or High Water – is a guarantee that Humanity screws this up.
Comment by Nikos Helios — 25 February 2023 @ 12:22 am
Hi again NH
It’s not all about Humanity-as-a-whole. It’s also about individual humans.
We shouldn’t prioritise “avoid any risk whatsoever of humanity ceasing to exist” over every other consideration.
For example, someone might suggest that killing everyone with an IQ over 85 is a good way to prevent the creation of AGI. Would you endorse that “solution”?
“Campaigning for better safeguards” needn’t be as lame as you imply. Progress on health and safety standards is something we humans should celebrate. Positive examples include activism against acid rain, against the emissions of CFCs, against drunk drivers, against cars without safety features, against pollution in waters.
Do capitalist companies always get to dictate terms? The examples of anti-trust (anti-monopoly) legislation indicate otherwise.
Has that been enough? By no means! Is that all we need to counter the risks from AGI? Not at all. But we shouldn’t dismiss these ideas. Instead, we can, and should, find ways of making them much more effective.
Comment by David Wood — 25 February 2023 @ 10:38 am
Then by all means persevere with trying to improve safeguards, and campaigning for government-level actions.
Meantime, a rather different question: if a person finds themselves engaged on a project which they think has a not insignificant chance of being existentially dangerous to Humanity, and doing so in a timeframe where campaigning for safeguards etc isn’t going to cut any ice, what should they do?
If it helps, here’s a half-way-house example: imagine you’re working on a software project and you suddenly find yourself thinking that the CEO may very well be about to use this software to take over your country, replacing democracy with their own private dictatorship. Do you (1) campaign for safeguards on the use of said software, or (2) disable it any way you can and the hell with shareholders?
Well, a threat to Humanity is way bigger than such a threat to your country.
Obviously, you should disable such projects, right, if you think they’ve got to that point?
PS Disabling projects (software or otherwise) is a different type of action to killing everyone with a high IQ.
Comment by Nicola Hawley — 25 February 2023 @ 1:00 pm
I’m with you on your half-way house example, NH.
But let me offer a different example. Someone says: Vladimir Putin might unleash a nuclear war and destroy all of humanity. Therefore we shouldn’t do anything that might antagonise him. So whatever he wants in Ukraine (or Moldova or Latvia or Poland), just give it to him! We are better off red than dead. The future of all humanity is at stake!
I remember people in the UK making very similar arguments in the early 1980s about the presence of Soviet SS-20 weapons in Europe. Just give Brezhnev / Andropov / Chernenko whatever they want, if the crunch comes, people said. We are better red than dead. Thankfully, in this case, Ronald Reagan and Mikhail Gorbachev eventually found a different way to defuse the risks.
Comment by David Wood — 25 February 2023 @ 1:16 pm
David,
I like your Cold War example; it made me think. But we already had nuclear weapons in the world, and there’s a thing called Mutually Assured Destruction which greatly limits the likely use of them. If you have a nuclear-armed ideological enemy and you throw away your nuclear deterrent, expect to be invaded, with maybe one nuclear explosion somewhere to guarantee your capitulation.
In summary: if these weapons are possible, and if aggressive, ideologically opposed nations have them or might get them, then you had better get some yourself, or get one of these nuclear nations as your guarantor.
And *then* you better do what Reagan and Gorbachev did! 🙂
But this is the world of the rights and wrongs of world powers, world politics, world weaponry, geopolitical game theory: (a) very difficult judgement calls, (b) between a variety of horrific outcomes, (c) in a game where you truly know very little for sure.
I’m talking about a situation without (a), (b) or (c) – software companies wiping out the Human Race because their technologists love making new things, their directors think there might be money in it, and there’s thus market competition to be first in each new way. And said technologists realising, nearer the time, what awesome power this new intelligence will soon wield, and how impossible it would be for Humanity to keep a lid on it for the foreseeable future (or, given “competition”, bugs and subversion, for five minutes).
I’m hoping one day (soon) you’ll reconsider my question and say “Yes, at this point, any sane person should try to destroy such projects.”
Actually it’s tempting, so that I can sleep at night, to conclude that *loads* of software people would *inevitably* destroy such projects. So I don’t need to worry. 🙂
But then, I know software people. (And I know about uber-Capitalist owners too.)
Anyway, Gary Marcus today offers us some middle-ground: ban the current AI free-for-all, and just allow carefully-controlled AI *Research*, with monster penalties for breaking the rules. Kinda like when new drugs are developed: slowly, slowly.
I’d say it’s only a start. You might say it’s enough, for now. But as for how we *get* governments to legislate this… I see no way to make them act in time, and that’s part of my point. Any ideas?
Comment by Nicola Helios — 26 February 2023 @ 6:53 pm