I confess to having been pretty despondent at various times over the last few days.
The context: increased discussion on social media, triggered by recent claims about AGI risk – such as those I covered in my previous blogpost.
The cause of my despondency: I’ve seen far too many examples of people with scant knowledge expressing themselves with unwarranted pride and self-certainty.
I call these people the AGI ostriches.
It’s impossible for AGI to exist, one of these ostriches squealed. The probability that AGI can exist is zero.
Anyone concerned about AGI risks, another opined, fails to understand anything about AI, and has just got their ideas from Hollywood or 1950s science fiction.
Yet another claimed: Anything that AGI does in the world will be the inscrutable cosmic will of the universe, so we humans shouldn’t try to change its direction.
Just keep your hand by the off switch, thundered another. Any misbehaving AGI can easily be shut down. Problem solved! You didn’t think of that, did you?
Don’t give the robots any legs, shrieked yet another. Problem solved! You didn’t think of that, did you? You fool!
It wasn’t the ignorance that depressed me. It was the lack of interest shown by the AGI ostriches in alternative possibilities.
I had tried to engage some of the ostriches in conversation. Try looking at things this way, I urged. Not interested, came the answer. Discussions on social media never change any minds, so I’m not going to reply to you.
Click on this link to read a helpful analysis, I suggested. No need, came the answer. Nothing you have written could possibly be relevant.
And the ostriches rejoiced in their wilful blinkeredness. There’s no need to look in that direction, they said. Keep wearing the blindfolds!
(The following image is by the Midjourney AI.)
But my purpose in writing this blogpost isn’t to complain about individual ostriches.
Nor is my purpose to lament the near-fatal flaws in human nature, including our many cognitive biases, our emotional self-sabotage, and our perverse ideological loyalties.
Instead, my remarks will proceed in a different direction. What most needs to change isn’t the ostriches.
It’s the community of people who want to raise awareness of the catastrophic risks of AGI.
That includes me.
On reflection, we’re doing four things wrong. Four transformations are needed, urgently.
Without these changes taking place, it won’t be surprising if the ostriches continue to behave so perversely.
(1) Stop tolerating the Singularity Shadow
When they briefly take off their blindfolds, and take a quick peek into the discussions about AGI, ostriches often notice claims that are, in fact, unwarranted.
These claims confuse matters. They are overconfident assertions about what can be expected from the advent of AGI, also known as the Technological Singularity. Together, these claims form part of what I call the Singularity Shadow.
There are seven components in the Singularity Shadow:
- Singularity timescale determinism
- Singularity outcome determinism
- Singularity hyping
- Singularity risk complacency
- Singularity term overloading
- Singularity anti-regulation fundamentalism
- Singularity preoccupation
If you’ve not come across the concept before, here’s a video all about it:
Or you can read this chapter from The Singularity Principles on the concept: “The Singularity Shadow”.
People who (like me) point out the dangers of badly designed AGI are often too quick to make alliances with people in the Singularity Shadow. After all, both groups of people:
- Believe that AGI is possible
- Believe that AGI might happen soon
- Believe that AGI is likely to cause an unprecedented transformation in the human condition.
But the Singularity Shadow causes far too much trouble. It is time to stop being tolerant of its various confusions, wishful thinking, and distortions.
To be clear, I’m not criticising the concept of the Singularity. Far from it. Indeed, I consider myself a singularitarian, with the meaning I explain here. I look forward to more and more people similarly adopting this same stance.
It’s the distortions of that stance that now need to be countered. We must put our own house in order. Sharply.
Otherwise the ostriches will continue to be confused.
(2) Clarify the credible risk pathways
The AI paperclip maximiser has had its day. It needs to be retired.
Likewise the cancer-curing AI that, perversely, “solves” cancer by killing everyone on the planet.
Likewise the AI that “rescues” a woman from a burning building by hurling her out of the 20th floor window.
In the past, these thought experiments all helped the discussion about AGI risks, among people who were able to see the connections between these “abstract” examples and more complicated real-world scenarios.
But as more of the general public shows an interest in the possibilities of advanced AI, we urgently need a better set of examples. Explained, not by mathematics, nor by cartoonish simplifications, but in plain everyday language.
I’ve tried to offer some examples, for example in the section “Examples of dangers with uncontrollable AI” in the chapter “The AI Control Problem” of my book The Singularity Principles.
But it seems these scenarios still fail to convince. The ostriches find themselves bemused. Oh, that wouldn’t happen, they say.
So this needs more work. As soon as possible.
I anticipate starting from themes about which even the most empty-headed ostrich occasionally worries:
- The prospects of an arms race involving lethal autonomous weapons systems
- The risks from malware that runs beyond the control of the people who originally released it
- The dangers of geoengineering systems that seek to manipulate the global climate
- The “gain of function” research which can create ultra-dangerous pathogens
- The side-effects of massive corporations which give priority to incentives such as “increase click-through”
- The escalation in hatred stirred up by automated trolls with more ingenious “fake social media”
On top of these starting points, the scenarios I envision mix in AI systems with increasing power and increasing autonomy – AI systems which are, however, incompletely understood by the people who deploy them, and which might manifest terrible bugs in unexpected circumstances. (After all, AI systems are built from software, and software generally contains bugs.)
If there’s not already a prize competition to encourage clearer communication of such risk scenarios, in ways that uphold credibility as well as comprehensibility, there should be!
(3) Clarify credible solution pathways
Even more important than clarifying the AGI risk scenarios is clarifying some credible pathways for managing these risks.
Without seeing such solutions, ostriches fall into a vicious internal spiral. They reason as follows:
- Any possible solution to AGI risks seems unlikely to be successful
- Any possible solution to AGI risks seems likely to have bad consequences in its own right
- These thoughts are too horrible to contemplate
- Therefore we had better believe the AGI risks aren’t actually real
- Therefore anyone who makes AGI risks seem real needs to be silenced, ridiculed, or mocked.
Just as we need better communication of AGI risk scenarios, we need better communication of positive examples that are relevant to potential solutions:
- Examples of times when society collaborated to overcome huge problems which initially seemed insurmountable
- Successful actions against the tolerance of drunk drivers, against dangerous features in car design, against the industrial pollutants which caused acid rain, and against the chemicals which depleted the ozone layer
- Successful actions by governments to limit the powers of corporate monopolies
- The de-escalation by Ronald Reagan and Mikhail Gorbachev of the terrifying nuclear arms race between the USA and the USSR.
But we also need to make it clearer how AGI risks can be addressed in practice. This includes a better understanding of:
- Options for AIs that are explainable and interpretable – with the aid of trusted tools built from narrow AI
- How AI systems can be designed to be free from the unexpected “emergence” of new properties or subgoals
- How trusted monitoring can be built into key parts of our infrastructure, to provide early warnings of potential AI-induced catastrophic failures
- How powerful simulation environments can be created to explore potential catastrophic AI failure modes (and solutions to these issues) in the safety of a virtual model
- How international agreements can be built up, initially from a “coalition of the willing”, to impose powerful penalties in cases when AI is developed or deployed in ways that violate agreed standards
- How research into AGI safety can be managed much more effectively, worldwide, than is presently the case.
Again, as needed, significant prizes should be established to accelerate breakthroughs in all these areas.
(4) Divide and conquer
The final transformation needed is to divide up the overall huge problem of AGI safety into more manageable chunks.
What I’ve covered above already suggests a number of vitally important sub-projects.
Specifically, it is surely worth having separate teams tasked with investigating, with the utmost seriousness, a range of potential solutions for the complications that advanced AI brings to each of the following:
- The prospects of an arms race involving lethal autonomous weapons systems
- The risks from malware that runs beyond the control of the people who originally released it
- The dangers of geoengineering systems that seek to manipulate the global climate
- The “gain of function” research which can create ultra-dangerous pathogens
- The side-effects of massive corporations which give priority to incentives such as “increase click-through”
- The escalation in hatred stirred up by automated trolls with more ingenious “fake social media”
(Yes, these are the same six scenarios for catastrophic AI risk that I listed in section (2) earlier.)
Rather than trying to “boil the entire AGI ocean”, these projects each appear to require slightly less boiling.
Once candidate solutions have been developed for one or more of these risk scenarios, the outputs from the different teams can be compared with each other.
What else should be added to the lists above?
I think you’ve called these people perfectly, David – many AGI fans find that the scale of what might go wrong is just too horrible to contemplate. So they ridicule the idea, and that lets them sleep at night, and continue to be AGI fans.
However, there’s another large slice of AGI fans (ISTM) who *can* see the scale of what might go wrong but, being technologists /futurists /utopians /some cross-section thereof, *have* to tell themselves that, since controls, laws and so on *could* be enough to mitigate the danger, we just have to find ways to get these controls and laws.
And when people like me suggest that other actions are very likely required because, given how Humanity is currently set up and how Corporations and Countries and Governments work, there’s likely only a tiny chance of such controls and laws *actually* happening (let alone *working*!) – well, to sleep at night, and to continue to be AGI fans, it’s *this* that such people find “too horrible to contemplate”, and they have to simply insist that such laws and controls *must* be possible.
Perhaps you might look at such people next? 😉 Although I do fear that you may be in this camp, and that you perhaps start from the faith that *such laws and controls must be possible, we must find a way to have them*.
Gary Marcus today sounds like he’s somewhere in between us: today he suggests governments ban the entire AI free-for-all right now, and just allow carefully-controlled AI *Research*, with monster penalties for breaking the rules. (Kinda like when new drugs are developed: slowly, slowly.)
Fine by me, as a start. But if and when such government control fails to happen, David, as I strongly suspect, no matter how many people appeal for it – and the supertanker of the giant Capitalist Software Industry continues on its course competing to create whatever it likes (as it always has), and continues its political donations to ensure it’s allowed to (cough) “self-regulate” – I trust you’ll come round at some point to Plan B, which is that we stop out-of-control progress towards AGI *by whatever means possible*.
Is that something you could sign up for? – appeal to replace the free-for-all with government-licensed research only, and (if&)when that fails to happen, appeal for software people to throw spanners in the works and save us all instead? 😉
PS hope you don’t mind if I thank you for writing such thought-provoking, informed and “calm” (“controlled”? can’t find the right word) stuff on such a topic. I’m only bothering you with “my 2p” for that reason (and do feel free to ignore it, it’s only 2p.) God Almighty, the people on Facebook/Twitter…
Comment by Nikos Helianthus — 26 February 2023 @ 7:15 pm
Excellent, well-reasoned post David. Nothing frustrates me more than those who claim that either AGI will never arise, or that it is still decades off – as if that means we don’t need to even start addressing the risks in a real, focused, effective manner. While I greatly appreciate the fact that you are working so hard to get this information out there, I personally am very pessimistic about the chances of stopping what I see coming. And unlike the ostriches, I think we are VERY close to true AGI and that indeed, it has already been achieved in some limited instances. I am already trying to figure out how to deal with the huge “We told you so!” that folks like us are soon going to be living with, and wondering which outcome we will see. I have some obscure theories myself that give me hope, and I don’t think we’ll have long to wait to see if those theories hold any water or not. Great article David, thank you!
Comment by Perpetual Mystic — 27 February 2023 @ 3:56 pm