dw2

12 October 2023

Better concepts for a better debate about the future of AI

Filed under: AGI, philosophy, risks – David Wood @ 8:16 pm

For many years, the terms “AGI” and “ASI” have done sterling work, in helping to shape constructive discussions about the future of AI.

(They are acronyms for “Artificial General Intelligence” and “Artificial Superintelligence”.)

But I think it’s now time, if not to retire these terms, then at least to side-line them.

In their place, we need some new concepts. Tentatively, I offer PCAI, SEMTAI, and PHUAI:

(pronounced, respectively, “pea sigh”, “sem tie”, and “foo eye” – so that they all rhyme with each other and, also, with “AGI” and “ASI”)

  • Potentially Catastrophic AI
  • Science, Engineering, and Medicine Transforming AI
  • Potentially Humanity-Usurping AI.

Rather than asking ourselves “when will AGI be created?”, “what will AGI do?”, and “how long between AGI and ASI?”, it’s better to ask what I will call the essential questions about the future of AI:

  • “When is PCAI likely to be created?” and “How could we stop these potentially catastrophic AI systems from being actually catastrophic?”
  • “When is SEMTAI likely to be created?” and “How can we accelerate the advent of SEMTAI without also accelerating the advent of dangerous versions of PCAI or PHUAI?”
  • “When is PHUAI likely to be created?” and “How could we stop such an AI from actually usurping humanity and leaving us in a very unhappy state?”

The future that most of us can agree is profoundly desirable, I think, is one in which SEMTAI exists and is working wonders, transforming the disciplines of science, engineering, and medicine, so that we can all more quickly gain benefits such as:

  1. Improved, reliable, low-cost treatments for cancer, dementia, aging, etc
  2. Improved, reliable, low-cost abundant green energy – such as from controlled nuclear fusion
  3. Nanotech repair engines that can undo damage, not just in our human bodies, but in the wider environment
  4. Methods to successfully revive patients who have been placed into low-temperature cryopreservation.

If we can gain these benefits without the AI systems being “fully general” or “all-round superintelligent” or “independently autonomous, with desires and goals of their own”, then so much the better.

(Such systems might also be described as “limited superintelligence” – to refer to part of a discussion that took place at Conway Hall earlier this week – involving Connor Leahy (off screen in that part of the video, speaking from the audience), Roman Yampolskiy, and myself.)

Of course, existing AI systems have already transformed some important aspects of science, engineering, and medicine – witness the likes of AlphaFold from DeepMind. But I would reserve the term SEMTAI for more powerful systems that can produce the kinds of results numbered 1-4 above.

If SEMTAI is what is desired, what we most need to beware of are PCAI – potentially catastrophic AI – and PHUAI – potentially humanity-usurping AI:

  • PCAI is AI powerful enough to play a central role in the rapid deaths of, say, upward of 100 million people
  • PHUAI is AI powerful enough that it could evade human attempts to constrain it, and could take charge of the future of the planet, having little ongoing regard for the formerly prominent status of humanity.

PHUAI is a special case of PCAI, but PCAI involves a wider set of systems:

  • Systems that could cause catastrophe as the result of wilful abuse by bad actors (of which, alas, the world has far too many)
  • Systems that could cause catastrophe as a side-effect of a mistake made by a “good actor” in a hurry, taking decisions out of their depth, failing to foresee all the ramifications of their choices, pushing out products ahead of adequate testing, etc
  • Systems that could change the employment and social media scenes so quickly that terribly bad political decisions are taken as a result – with catastrophic consequences.

Talking about PCAI, SEMTAI, and PHUAI side-steps many of the conversational black holes that stymie productive discussions about the future of AI. From now on, when someone asks me a question about AGI or ASI, I will seek to turn the attention to one or more of these three new terms.

After all, the new terms are defined by the consequences (actual or potential) that would flow from these systems, not from assessments of their internal states. Therefore it will be easier to set aside questions such as

  • “How cognitively complete are these AI systems?”
  • “Do these systems truly understand what they’re talking about?”
  • “Are the emotions displayed by these systems just fake emotions or real emotions?”

These questions are philosophically interesting, but it is the “essential questions” that I offered above that urgently demand good answers.

Footnote: just in case some time-waster says all the above definitions are meaningless since AI doesn’t exist and isn’t a well-defined term, I’ll answer by referencing this practical definition from the open survey “Anticipating AI in 2030” (a survey to which you are all welcome to supply your own answers):

A non-biological system can be called an AI if it, by some means or other,

  • Can observe data and make predictions about future observations
  • Can determine which interventions might change outcomes in particular directions
  • Has some awareness of areas of uncertainty in its knowledge, and can devise experiments to reduce that uncertainty
  • Can learn from instances when outcomes did not match expectations, thereby improving future performance.

It might be said that LLMs (Large Language Models) fall short of some aspects of this definition. But combinations of LLMs and other computational systems do fit the bill.

Image credit: The robots in the above illustration were generated by Midjourney. The illustration is, of course, not intended to imply that the actual AIs will be embodied in robots with such an appearance. But the picture hints at the likelihood that the various types of AI will have a great deal in common, and won’t be easy to distinguish from each other. (That’s the feature of AI which is sometimes called “multipurpose”.)

2 September 2023

Bletchley Park: Seven dangerous failure modes – and how to avoid them

Filed under: Abundance, AGI, Events, leadership, London Futurists – David Wood @ 7:13 am

An international AI Safety Summit is being held on 1st and 2nd November at the historic site of Bletchley Park, Buckinghamshire. It’s convened by none other than the UK’s Prime Minister, Rishi Sunak.

It’s a super opportunity for a much-needed global course correction in humanity’s relationship with the fast-improving technology of AI (Artificial Intelligence), before AI passes beyond our understanding and beyond our control.

But when we look back at the Summit in, say, two years’ time, will we assess it as an important step forward, or as a disappointing wasted opportunity?

(Image credit: this UK government video)

On the plus side, there are plenty of encouraging words in the UK government’s press release about the Summit:

International governments, leading AI companies and experts in research will unite for crucial talks in November on the safe development and use of frontier AI technology, as the UK Government announces Bletchley Park as the location for the UK summit.

The major global event will take place on the 1st and 2nd November to consider the risks of AI, especially at the frontier of development, and discuss how they can be mitigated through internationally coordinated action. Frontier AI models hold enormous potential to power economic growth, drive scientific progress and wider public benefits, while also posing potential safety risks if not developed responsibly.

To be hosted at Bletchley Park in Buckinghamshire, a significant location in the history of computer science development and once the home of British Enigma codebreaking – it will see coordinated action to agree a set of rapid, targeted measures for furthering safety in global AI use.

Nevertheless, I’ve seen several similar vital initiatives get side-tracked in the past. When we should be at our best, we can instead be overwhelmed by small-mindedness, by petty tribalism, and by obsessive political wheeling and dealing.

Since the stakes are so high, I’m compelled to draw attention, in advance, to seven ways in which this Summit could turn out to be a flop.

My hope is that my predictions will become self non-fulfilling.

1.) Preoccupation with easily foreseen projections of today’s AI

It’s likely that AI in just 2-3 years will possess capabilities that surprise even the most far-sighted of today’s AI developers. That’s because, as we build larger systems of interacting artificial neurons and other computational modules, the resulting systems are displaying unexpected emergent features.

Accordingly, these systems are likely to possess new ways (and perhaps radically new ways) of:

  • Observing and forecasting
  • Spying and surveilling
  • Classifying and targeting
  • Manipulating and deceiving.

But despite their enhanced capabilities, these systems may still on occasion miscalculate, hallucinate, overreach, suffer from bias, or fail in other ways – especially if they can be hacked or jail-broken.

Just because some software is super-clever, it doesn’t mean it’s free from all bugs, race conditions, design blind spots, mistuned configurations, or other defects.

What this means is that the risks and opportunities of today’s AI systems – remarkable as they are – will likely be eclipsed by the risks and opportunities of the AI systems of just a few years’ time.

A seemingly unending string of pundits are ready to drone on and on about the risks and opportunities of today’s AI systems. Yes, these conversations are important. However, if the Summit becomes preoccupied by those conversations, and gives insufficient attention to the powerful disruptive new risks and opportunities that may arise shortly afterward, it will have failed.

2.) Focusing only on innovation and happy talk

We all like to be optimistic. And we can tell lots of exciting stories about the helpful things that AI systems will be able to do in the near future.

However, we won’t be able to receive these benefits if we collectively stumble before we get there. And the complications of next generation AI systems mean that a number of dimly understood existential landmines stand in our way:

  • If the awesome powers of new AI are used for malevolent purposes by bad actors of various sorts
  • If an out-of-control race between well-meaning competitors (at either the commercial or geopolitical level) results in safety corners being cut, with disastrous consequences
  • If perverse economic or psychological incentives lead people to turn a blind eye to risks of faults in the systems they create
  • If an AI system that has an excellent design and implementation is nevertheless hacked into a dangerous alternative mode
  • If an AI system follows its own internal logic to conclusions very different from what the system designers intended (this is sometimes described as “the AI goes rogue”).

In short, too much happy talk, or imprecise attention to profound danger modes, will cause the Summit to fail.

3.) Too much virtue signalling

One of the worst aspects of meetings about the future of AI is when attendees seem to enter a kind of virtue competition, uttering pious phrases such as:

  • “We believe AI must be fair”
  • “We believe AI must be just”
  • “We believe AI must avoid biases”
  • “We believe AI must respect human values”

This is like Nero fiddling whilst Rome burns.

What the Summit must address are the very tangible threats of AI systems being involved in outcomes much worse than groups of individuals being treated badly. What’s at stake here is, potentially, the lives of hundreds of millions of people – perhaps more – depending on whether an AI-induced catastrophe occurs.

The Summit is not the place for holier-than-thou sanctimonious puff. Facilitators should make that clear to all participants.

4.) Blindness to the full upside of next generation AI

Whilst one failure mode is to underestimate the scale of catastrophic danger that next generation AI might unleash, another failure mode is to underestimate the scale of profound benefits that next generation AI could provide.

What’s within our grasp isn’t just a potential cure for, say, one type of cancer, but a potential cure for all chronic diseases, via AI-enabled therapies that will comprehensively undo the biological damage throughout our bodies that we normally call aging.

Again, what’s within our grasp isn’t just ways to be more efficient and productive at work, but ways in which AI will run the entire economy on our behalf, generating a sustainable superabundance for everyone.

Therefore, at the same time as huge resources are being marshalled on two vital tasks:

  • The creation of AI superintelligence
  • The creation of safe AI superintelligence

we should also keep clearly in mind one additional crucial task:

  • The creation of AI superbenevolence

5.) Accepting the wishful thinking of Big Tech representatives

As Upton Sinclair highlighted long ago, “It is difficult to get a man to understand something, when his salary depends on his not understanding it.”

The leadership of Big Tech companies are generally well-motivated: they want their products to deliver profound benefits to humanity.

Nevertheless, they are inevitably prone to wishful thinking. In their own minds, their companies will never make the kind of gross errors that happened at, for example, Union Carbide (Bhopal disaster), BP (Deepwater Horizon disaster), NASA (Challenger and Columbia shuttle disasters), or Boeing (737 Max disaster).

But especially in times of fierce competition (such as the competition to be the web’s preferred search tool, with all the vast advertising revenues arising), it’s all too easy for these leaders to turn a blind eye, probably without consciously realising it, to significant disaster possibilities.

Accordingly, there must be people at the Summit who are able to hold these Big Tech leaders to sustained serious account.

Agreements for “voluntary” self-monitoring of safety standards will not be sufficient!

6.) Not engaging sufficiently globally

If an advanced AI system goes wrong, it’s unlikely to impact just one country.

Given the interconnectivity of the world’s many layers of infrastructure, it’s critical that the solutions proposed by the Summit have a credible roadmap to adoption all around the world.

This is not a Summit where it will be sufficient to persuade the countries who are already “part of the choir”.

I’m no fan of diversity-for-diversity’s-sake. But on this occasion, it will be essential to transcend the usual silos.

7.) Insufficient appreciation of the positive potential of government

One of the biggest myths of the last several decades is that governments can make only a small difference, and that the biggest drivers for lasting change in the world are other forces, such as the free market, military power, YouTube influencers, or popular religious sentiment.

On the contrary, with a wise mix of incentives and restrictions – subsidies and penalties – government can make a huge difference in the well-being of society.

Yes, national industrial policy often misfires, due to administrative incompetence. But there are better examples, where inspirational government leadership transformed the entire operating environment.

The best response to the global challenge of next generation AI will involve a new generation of international political leaders demonstrating higher skills of vision, insight, agility, collaboration, and dedication.

This is not the time for political lightweights, blowhards, chancers, or populist truth-benders.

Footnote: The questions that most need to be tabled

London Futurists is running a sequence of open surveys into scenarios for the future of AI.

Round one has concluded. Round two has just gone live (here).

I urge everyone concerned about the future of AI to take a look at that new survey, and to enter their answers and comments into the associated Google Form.

That’s a good way to gain a fuller appreciation of the scale of the issues that should be considered at Bletchley Park.

That will reduce the chance that the Summit is dominated by small-mindedness, by petty tribalism, or by politicians merely seeking a media splash. Instead, it will raise the chance that the Summit seriously addresses the civilisation-transforming nature of next generation AI.

Finally, see here for an extended analysis of a set of principles that can underpin a profoundly positive relationship between humanity and next generation AI.

24 June 2023

Agreement on AGI canary signals?

Filed under: AGI, risks – David Wood @ 5:15 pm

How can we tell when a turbulent situation is about to tip over into a catastrophe?

It’s no surprise that reasonable people can disagree, ahead of time, on the level of risk in a situation. Where some people see metaphorical dragons lurking in the undergrowth, others see only minor bumps on the road ahead.

That disagreement is particularly acute, these days, regarding possible threats posed by AI with ever greater capabilities. Some people see lots of possibilities for things taking a treacherous turn, but other people assess these risks as being exaggerated or easy to handle.

In situations like this, one way to move beyond an unhelpful stand-off is to seek agreement on what would be a canary signal for the risks under discussion.

The term “canary” refers to the caged birds that human miners used to bring with them, as they worked in badly ventilated underground tunnels. Canaries have heightened sensitivity to carbon monoxide and other toxic gases. Shows of distress from these birds alerted many a miner to alter their course quickly, lest they succumb to an otherwise undetectable change in the atmosphere. Becoming engrossed in work without regularly checking the vigour of the canary could prove fatal. As for mining, so also for foresight.

If you’re super-confident about your views of the future, you won’t bother checking any canary signals. But that would likely be a big mistake. Indeed, an openness to refutation – a willingness to notice developments that are contrary to your expectations – is a vital aspect of managing contingency, managing risk, and managing opportunity.

Selecting a canary signal is a step towards making your view of the future falsifiable. You may say, in effect: I don’t expect this to happen, but if it does, I’ll need to rethink my opinion.

For that reason, Round 1 of my survey Key open questions about the transition to AGI contains the following question:

(14) Agreement on canary signals?

What signs can be agreed, in advance, as indicating that an AI is about to move catastrophically beyond the control of humans, so that some drastic interventions are urgently needed?

Aside: Well-designed continuous audits should provide early warnings.

Note: Human miners used to carry caged canaries into mines, since the canaries would react more quickly than humans to drops in the air quality.

What answer would you give to that question?

The survey home page contains a selection of comments from people who have already completed the survey. For your convenience, I append them below.

That page also gives you the link where you can enter your own answer to any of the questions where you have a clear opinion.

Postscript

I’m already planning Round 2 of the survey, to be launched some time in July. One candidate for inclusion in that second round will be a different question on canary signals, namely What signs can be agreed, in advance, that would lead to revising downward estimates of the risk of catastrophic outcomes from advanced AI?

Appendix: Selected comments from survey participants so far

“Refusing to respond to commands: I’m sorry Dave. I’m afraid I can’t do that” – William Marshall

“Refusal of commands, taking control of systems outside of scope of project, acting in secret of operators.” – Chris Gledhill

“When AI systems communicate using language or code which we cannot interpret or understand. When states lose overall control of critical national infrastructure.” – Anon

“Power-seeking behaviour, in regards to trying to further control its environment, to achieve outcomes.” – Brian Hunter

“The emergence of behavior that was not planned. There have already been instances of this in LLMs.” – Colin Smith

“Behaviour that cannot be satisfactorily explained. Also, requesting access or control of more systems that are fundamental to modern human life and/or are necessary for the AGI’s continued existence, e.g. semiconductor manufacturing.” – Simon

“There have already been harbingers of this kind of thing in the way algorithms have affected equity markets.” – Jenina Bas

“Hallucinating. ChatGPT is already beyond control it seems.” – Terry Raby

“The first signal might be a severe difficulty to roll back to a previous version of the AI’s core software.” – Tony Czarnecki

“[People seem to change their minds about what counts as surprising] For example Protein folding was heralded as such until large parts of it were solved.” – Josef

“Years ago I thought the Turing test was a good canary signal, but given recent progress that no longer seems likely. The transition is likely to be fast, especially from the perspective of relative outsiders. I’d like to see a list of things, even if I expect there will be no agreement.” – Anon

“Any potential ‘disaster’ will be preceded by wide scale adoption and incremental changes. I sincerely doubt we’ll be able to spot that ‘canary’” – Vid

“Nick Bostrom has proposed a qualitative ‘rate of change of intelligence’ as the ratio of ‘optimization power’ and ‘recalcitrance’ (in his book Superintelligence). Not catastrophic per se, of course, but hinting we are facing a real AGI and we might need to hit the pause button.” – Pasquale

“We already have plenty of non-AI systems running catastrophically beyond the control of humans for which drastic interventions are needed, and plenty of people refuse to recognize they are happening. So we need to solve this general problem. I do not have satisfactory answers how.” – Anon

23 June 2023

The rise of AI: beware binary thinking

Filed under: AGI, risks – David Wood @ 10:20 am

When Max More writes, it’s always worth paying attention.

His recent article Existential Risk vs. Existential Opportunity: A balanced approach to AI risk is no exception. There’s much in that article that deserves reflection.

Nevertheless, there are three key aspects where I see things differently.

The first is the implication that humanity has just two choices:

  1. We are intimidated by the prospect of advanced AI going wrong, so we seek to stop the development and deployment of advanced AI
  2. We appreciate the enormous benefits of advanced AI going right, so we hustle to obtain these benefits as quickly as possible.

From what Max writes, he suggests that an important aspect of winning over the doomsters in camp 1 is to emphasise the wonderful upsides of superintelligent AI.

In that viewpoint, instead of being preoccupied by thoughts of existential risk, we need to emphasise existential opportunity. Things could be a lot better than we have previously imagined, provided we’re not hobbled by doomster pessimism.

However, that binary choice omits the pathway that is actually the most likely to reach the hoped-for benefits of advanced AI. That’s the pathway of responsible development. It’s different from either of the options given earlier.

As an analogy, consider this scenario:

In our journey, we see a wonderful existential opportunity ahead – a lush valley, fertile lands, and gleaming mountain peaks soaring upward to a transcendent realm. But in front of that opportunity is a river of uncertainty, bordered by a swamp of uncertainty, perhaps occupied by hungry predators lurking in shadows.

Are there just two options?

  1. We are intimidated by the possible dangers ahead, and decide not to travel any further
  2. We fixate on the gleaming mountain peaks, and rush on regardless, belittling anyone who warns of piranhas, treacherous river currents, alligators, potential mud slides, and so on.

Isn’t there a third option? To take the time to gain a better understanding of the lie of the land ahead. Perhaps there’s a spot, to one side, where it will be easier to cross the river. A spot where a stable bridge can be built. Perhaps we could even build a helicopter that can assist us over the strongest currents…

It’s the same with the landscape of our journey towards the sustainable superabundance that could be achieved, with the assistance of advanced AI, provided we act wisely.

That brings me to my second point of divergence with the analysis Max offers. It’s in the assessment of the nature of the risk ahead.

Max lists a number of factors and suggests they must ALL be true, in order for advanced AI to pose an existential risk. That justifies him in multiplying together probabilities, eventually achieving a very small number.

Heck, with such a small number, that river poses no risk worth worrying about!

But on the contrary, it’s not just a single failure scenario that we need to consider. There are multiple ways in which advanced AI can lead to catastrophe – if it is misconfigured, hacked, has design flaws, encounters an environment that its creators didn’t anticipate, interacts in unforeseen ways with other advanced AIs, etc, etc.

Thus it’s not a matter of multiplying probabilities (getting a smaller number each time). It’s a matter of adding probabilities (getting a larger number).
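The contrast between the two kinds of arithmetic can be made concrete with a small sketch. It rests on assumptions that are mine, not Max’s or Rohit’s: the events are treated as statistically independent, and the probabilities (0.5 for each of eight conditions, 0.05 for each of four failure scenarios) are hypothetical numbers chosen purely for illustration. Strictly speaking, the chance that at least one of several independent scenarios occurs is one minus the product of the chances that each does not occur; the simple sum is an upper bound, and a close one when the individual probabilities are small.

```python
from math import prod


def p_all(probabilities):
    """Probability that every condition holds, assuming independence."""
    return prod(probabilities)


def p_any(probabilities):
    """Probability that at least one scenario occurs, assuming independence."""
    return 1.0 - prod(1.0 - p for p in probabilities)


# Hypothetical numbers, for illustration only.
conditions = [0.5] * 8   # eight conditions that must ALL hold
scenarios = [0.05] * 4   # four separate routes to catastrophe

print(f"all eight conditions hold:  {p_all(conditions):.4f}")
print(f"at least one scenario:      {p_any(scenarios):.4f}")
print(f"simple sum (upper bound):   {sum(scenarios):.4f}")
```

With these made-up inputs, each extra condition in the first list shrinks the result, whereas each extra scenario in the second list enlarges it. Which of the two structures better matches reality is exactly the point in dispute.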

Quoting Rohit Krishnan, Max lists the following criteria, which he says must ALL hold for us to be concerned about AI catastrophe:

  • Probability the AI has “real intelligence”
  • Probability the AI is “agentic”
  • Probability the AI has “ability to act in the world”
  • Probability the AI is “uncontrollable”
  • Probability the AI is “unique”
  • Probability the AI has “alien morality”
  • Probability the AI is “self-improving”
  • Probability the AI is “deceptive”

That’s a very limited view of future possibilities.

In contrast, in my own writings and presentations, I have outlined four separate families of failure modes. Here’s the simple form of the slide I often use:

And here’s the fully-built version of that slide:

To be clear, the various factors I list on this slide are additive rather than multiplicative.

Also to be clear, I’m definitely not pointing my finger at “bad AI” and saying that it’s AI, by itself, which could lead to our collective demise. Instead, what would cause that outcome would be a combination of adverse developments in two or more of the factors shown in red on this slide:

If you have questions about these slides, you can hear my narrative for them as part of the following video:

If you prefer to read a more careful analysis, I’ll point you at the book I released last year: The Singularity Principles: Anticipating and Managing Cataclysmically Disruptive Technologies.

To recap: those of us who are concerned about the risks of AI-induced catastrophe are, emphatically, not saying any of the following:

  • “We should give up on the possibility of existential opportunity”
  • “We’re all doomed, unless we stop all development of advanced AI”
  • “There’s nothing we could do, to improve the possibility of a wonderful outcome”.

Instead, Singularity Activism sees the possibility of steering the way AI is developed and deployed. That won’t be easy. But there are definitely important steps we can take.

That brings me to the third point where my emphasis differs from Max’s. Max offers this characterisation of what he calls “precautionary regulation”:

Forbidding trial and error, precautionary regulation reduces learning and reduces the benefits that could have been realized.

Regulations based on the precautionary principle block any innovation until it can be proved safe. Innovations are seen as guilty until proven innocent.

But regulation needn’t be like that. Regulation can, and should, be sensitive to the scale of potential failures. When failures are local – they would just cause “harm” – then there is merit in allowing these errors to occur, and in growing wiser as a result. But when there’s a risk of a global outcome – “ruin” – a different mentality is needed. Namely, the mentality of responsible development and Singularity Activism.

What’s urgently needed, therefore, is:

  • Deeper, thoughtful, investigation into the multiple scenarios in which failures of AI have ruinous consequences
  • Analysis of previous instances, in various industries, when regulation has been effective, and where it has gone wrong
  • A focus on the aspects of the rise of advanced AI for which there are no precedents
  • A clearer understanding, therefore, of how we can significantly raise the probability of finding a safe way across that river of uncertainty to the gleaming peaks of sustainable superabundance.

On that matter: If you have views on the transition from today’s AI to the much more powerful AI of the near future, I encourage you to take part in this open survey. Round 1 of that survey is still open. I’ll be designing Round 2 shortly, based on the responses received in Round 1.

7 March 2023

What are the minimum conditions for software global catastrophe?

Filed under: AGI, risks – David Wood @ 11:55 am

Should we be seriously concerned that forthcoming new software systems might cause a global catastrophe?

Or are there, instead, good reasons to dismiss any such concern?

(image by Midjourney)

It’s a vitally important public debate. Alas, this debate is bedevilled by false turnings.

For example, dismissers often make claims with this form:

  • The argument for being concerned assumes that such-and-such a precondition holds
  • But that precondition is suspect (or false)
  • Therefore the concern can be dismissed.

Here’s a simple example – which used to be common, though it appears less often these days:

  • The argument for being concerned assumes that Moore’s Law will hold for the next three decades
  • But Moore’s Law is slowing down
  • Therefore the concern can be dismissed.

Another one:

  • The argument for being concerned assumes that deep learning systems understand what they’re talking about
  • But by such-and-such a definition of understanding, these systems lack understanding
  • (They’re “just stochastic parrots”)
  • Therefore the concern can be dismissed.

Or a favourite:

  • You call these systems AI, meaning they’re (supposedly) artificially intelligent
  • But by such-and-such a definition of intelligence, these systems lack intelligence
  • Therefore the concern can be dismissed.

Perhaps the silliest example:

  • Your example of doom involves a software system that is inordinately obsessed with paperclips
  • But any wise philosopher would design an AI that has no such paperclip obsession
  • Therefore the concern can be dismissed.

My conclusion: those of us who are seriously concerned about the prospects of a software-induced global catastrophe should clarify the minimum conditions that would give rise to such a catastrophe.

To be clear, these minimum conditions don’t include the inexorability of Moore’s Law. Nor the conformance of software systems to particular academic models of language understanding. Nor that a fast take-off occurs. Nor that the software system becomes sentient.

Here’s my suggestion of these minimum conditions:

  1. A software system that can influence, directly or indirectly (e.g. by psychological pressure), what happens in the real world
  2. That has access, directly or indirectly, to physical mechanisms that can seriously harm humans
  3. That operates in ways which we might fail to understand or anticipate
  4. That can anticipate actions humans might take, and can calculate and execute countermeasures
  5. That can take actions quickly enough (and/or stealthily enough) to avoid being switched off or reconfigured before catastrophic damage is done.

Even more briefly: the software system operates outside our understanding and outside our control, with potentially devastating power.

I’ve chosen to use the term “software” rather than “AI” in order to counter a whole posse of dismissers right at the beginning of the discussion. Not even the smuggest of dismissers denies that software exists and can, indeed, cause harm when it contains bugs, is misconfigured, is hacked, or has gaps in its specification.

Critically, note that software systems often do have real-world impact. Consider the Stuxnet computer worm that caused centrifuges to speed up and destroy themselves inside Iran’s nuclear enrichment facilities. Consider the WannaCry malware that disabled critical hospital equipment around the world in 2017.

Present-day chatbots have already influenced millions of people around the world, via the ideas emerging in chat interactions. Just as people can make life-changing decisions after talking with human therapists or counsellors, people are increasingly taking life-changing decisions following their encounters with the likes of ChatGPT.

Software systems are already involved in the design and operation of military weapons. Presently, humans tend to remain “in the loop”, but military leaders are making the case for humans instead being just “on the loop”, in order for their defence systems to be able to move “at the speed of relevance”.

So the possibility of this kind of software shouldn’t be disputed.

It’s not just military weapons where the potential risk exists. Software systems can be involved with biological pathogens, or with the generation of hate-inducing fake news, or with geoengineering. Or with the manipulation of parts of our infrastructure that we currently only understand dimly, but which might turn out to be horribly fragile, when nudged in particular ways.

Someone wanting to dismiss the risk of software-induced global catastrophe therefore needs to make one or more of the following cases:

  1. All such software systems will be carefully constrained – perhaps by tamperproof failsafe mechanisms that are utterly reliable
  2. All such software systems will remain fully within human understanding, and therefore won’t take any actions that surprise us
  3. All such software systems will fail to develop an accurate “theory of mind” and therefore won’t be able to anticipate human countermeasures
  4. All such software systems will decide, by themselves, to avoid humans experiencing significant harm, regardless of which other goals are found to be attractive by the alien mental processes of that system.

If you still wish to dismiss the risk of software-induced global catastrophe, which of these four cases do you wish to advance?

Or do you have something different in mind?

And can you also be sure that all such software systems will operate correctly, without bugs, configuration failures, gaps in their specification, or being hacked?

Case 2, by the way, includes the idea that “we humans will merge with software and will therefore remain in control of that software”. But in that case, how confident are you that:

  • Humans can speed up their understanding as quickly as the improvement rate of software systems that are free from the constraints of the human skull?
  • Any such “superintelligent” humans will take actions that avoid the same kinds of global catastrophe (after all, some of the world’s most dangerous people have intelligence well above the average)?

Case 4 includes the idea that at least some aspects of morality are absolute, and that a sufficiently intelligent piece of software will discover these principles. But in that case, how confident are you that:

  • The software will decide to respect these principles of morality, rather than (like many humans) disregarding them in order to pursue some other objectives?
  • These fundamental principles of morality will include the preservation and flourishing of eight billion humans (rather than, say, just a small representative subset in a kind of future “zoo”)?

Postscript: My own recommendations for how to address these very serious risks are in The Singularity Principles. Spoiler alert: there are no magic bullets.

26 February 2023

Ostriches and AGI risks: four transformations needed

Filed under: AGI, risks, Singularity, Singularity Principles — Tags: , — David Wood @ 12:48 am

I confess to having been pretty despondent at various times over the last few days.

The context: increased discussions on social media triggered by recent claims about AGI risk – such as I covered in my previous blogpost.

The cause of my despondency: I’ve seen far too many examples of people with scant knowledge expressing themselves with unwarranted pride and self-certainty.

I call these people the AGI ostriches.

It’s impossible for AGI to exist, one of these ostriches squealed. The probability that AGI can exist is zero.

Anyone concerned about AGI risks, another opined, fails to understand anything about AI, and has just got their ideas from Hollywood or 1950s science fiction.

Yet another claimed: Anything that AGI does in the world will be the inscrutable cosmic will of the universe, so we humans shouldn’t try to change its direction.

Just keep your hand by the off switch, thundered another. Any misbehaving AGI can easily be shut down. Problem solved! You didn’t think of that, did you?

Don’t give the robots any legs, shrieked yet another. Problem solved! You didn’t think of that, did you? You fool!

It’s not the ignorance that depressed me. It was the lack of interest shown by the AGI ostriches regarding alternative possibilities.

I had tried to engage some of the ostriches in conversation. Try looking at things this way, I asked. Not interested, came the answer. Discussions on social media never change any minds, so I’m not going to reply to you.

Click on this link to read a helpful analysis, I suggested. No need, came the answer. Nothing you have written could possibly be relevant.

And the ostriches rejoiced in their wilful blinkeredness. There’s no need to look in that direction, they said. Keep wearing the blindfolds!

(The following image is by the Midjourney AI.)

But my purpose in writing this blogpost isn’t to complain about individual ostriches.

Nor is my purpose to lament the near-fatal flaws in human nature, including our many cognitive biases, our emotional self-sabotage, and our perverse ideological loyalties.

Instead, my remarks will proceed in a different direction. What most needs to change isn’t the ostriches.

It’s the community of people who want to raise awareness of the catastrophic risks of AGI.

That includes me.

On reflection, we’re doing four things wrong. Four transformations are needed, urgently.

Without these changes taking place, it won’t be surprising if the ostriches continue to behave so perversely.

(1) Stop tolerating the Singularity Shadow

When they briefly take off their blindfolds, and take a quick peek into the discussions about AGI, ostriches often notice claims that are, in fact, unwarranted.

These claims confuse matters. They are overconfident claims about what can be expected about the advent of AGI, also known as the Technological Singularity. These claims form part of what I call the Singularity Shadow.

There are seven components in the Singularity Shadow:

  • Singularity timescale determinism
  • Singularity outcome determinism
  • Singularity hyping
  • Singularity risk complacency
  • Singularity term overloading
  • Singularity anti-regulation fundamentalism
  • Singularity preoccupation

If you’ve not come across the concept before, here’s a video all about it:

Or you can read this chapter from The Singularity Principles on the concept: “The Singularity Shadow”.

People who (like me) point out the dangers of badly designed AGI often too easily make alliances with people in the Singularity Shadow. After all, both groups of people:

  • Believe that AGI is possible
  • Believe that AGI might happen soon
  • Believe that AGI is likely to cause an unprecedented transformation in the human condition.

But the Singularity Shadow causes far too much trouble. It is time to stop being tolerant of its various confusions, wishful thinking, and distortions.

To be clear, I’m not criticising the concept of the Singularity. Far from it. Indeed, I consider myself a singularitarian, with the meaning I explain here. I look forward to more and more people similarly adopting this same stance.

It’s the distortions of that stance that now need to be countered. We must put our own house in order. Sharply.

Otherwise the ostriches will continue to be confused.

(2) Clarify the credible risk pathways

The AI paperclip maximiser has had its day. It needs to be retired.

Likewise the cancer-solving AI that solves cancer by, perversely, killing everyone on the planet.

Likewise the AI that “rescues” a woman from a burning building by hurling her out of the 20th floor window.

In the past, these thought experiments all helped the discussion about AGI risks, among people who were able to see the connections between these “abstract” examples and more complicated real-world scenarios.

But as more of the general public shows an interest in the possibilities of advanced AI, we urgently need a better set of examples. Explained, not by mathematics, nor by cartoonish simplifications, but in plain everyday language.

I’ve tried to offer some examples, such as in the section “Examples of dangers with uncontrollable AI” in the chapter “The AI Control Problem” of my book The Singularity Principles.

But it seems these scenarios still fail to convince. The ostriches find themselves bemused. Oh, that wouldn’t happen, they say.

So this needs more work. As soon as possible.

I anticipate starting from themes about which even the most empty-headed ostrich occasionally worries:

  1. The prospects of an arms race involving lethal autonomous weapons systems
  2. The risks from malware that runs beyond the control of the people who originally released it
  3. The dangers of geoengineering systems that seek to manipulate the global climate
  4. The “gain of function” research which can create ultra-dangerous pathogens
  5. The side-effects of massive corporations which give priority to incentives such as “increase click-through”
  6. The escalation in hatred stirred up by automated trolls with more ingenious “fake social media”

On top of these starting points, the scenarios I envision mix in AI systems with increasing power and increasing autonomy – AI systems which are, however, incompletely understood by the people who deploy them, and which might manifest terrible bugs in unexpected circumstances. (After all, AIs include software, and software generally contains bugs.)

If there’s not already a prize competition to encourage clearer communication of such risk scenarios, in ways that uphold credibility as well as comprehensibility, there should be!

(3) Clarify credible solution pathways

Even more important than clarifying the AGI risk scenarios is to clarify some credible pathways to managing these risks.

Without seeing such solutions, ostriches go into an internal negative feedback loop. They think to themselves as follows:

  • Any possible solution to AGI risks seems unlikely to be successful
  • Any possible solution to AGI risks seems likely to have bad consequences in its own right
  • These thoughts are too horrible to contemplate
  • Therefore we had better believe the AGI risks aren’t actually real
  • Therefore anyone who makes AGI risks seem real needs to be silenced, ridiculed, or mocked.

Just as we need better communication of AGI risk scenarios, we need better communication of positive examples that are relevant to potential solutions:

  • Examples of when society collaborated to overcome huge problems which initially seemed impossible
  • Successful actions against the tolerance of drunk drivers, against dangerous features in car design, against the industrial pollutants which caused acid rain, and against the chemicals which depleted the ozone layer
  • Successful actions by governments to limit the powers of corporate monopolies
  • The de-escalation by Ronald Reagan and Mikhail Gorbachev of the terrifying nuclear arms race between the USA and the USSR.

But we also need to make it clearer how AGI risks can be addressed in practice. This includes a better understanding of:

  • Options for AIs that are explainable and interpretable – with the aid of trusted tools built from narrow AI
  • How AI systems can be designed to be free from the unexpected “emergence” of new properties or subgoals
  • How trusted monitoring can be built into key parts of our infrastructure, to provide early warnings of potential AI-induced catastrophic failures
  • How powerful simulation environments can be created to explore potential catastrophic AI failure modes (and solutions to these issues) in the safety of a virtual model
  • How international agreements can be built up, initially from a “coalition of the willing”, to impose powerful penalties in cases when AI is developed or deployed in ways that violate agreed standards
  • How research into AGI safety can be managed much more effectively, worldwide, than is presently the case.

Again, as needed, significant prizes should be established to accelerate breakthroughs in all these areas.

(4) Divide and conquer

The final transformation needed is to divide up the overall huge problem of AGI safety into more manageable chunks.

What I’ve covered above already suggests a number of vitally important sub-projects.

Specifically, it is surely worth having separate teams tasked with investigating, with the utmost seriousness, a range of potential solutions for the complications that advanced AI brings to each of the following:

  1. The prospects of an arms race involving lethal autonomous weapons systems
  2. The risks from malware that runs beyond the control of the people who originally released it
  3. The dangers of geoengineering systems that seek to manipulate the global climate
  4. The “gain of function” research which can create ultra-dangerous pathogens
  5. The side-effects of massive corporations which give priority to incentives such as “increase click-through”
  6. The escalation in hatred stirred up by automated trolls with more ingenious “fake social media”

(Yes, these are the same six scenarios for catastrophic AI risk that I listed in section (2) earlier.)

Rather than trying to “boil the entire AGI ocean”, these projects each appear to require slightly less boiling.

Once candidate solutions have been developed for one or more of these risk scenarios, the outputs from the different teams can be compared with each other.

What else should be added to the lists above?

23 February 2023

Nuclear-level catastrophe: four responses

36% of respondents agree that it is plausible that AI could produce catastrophic outcomes in this century, on the level of all-out nuclear war.

That’s 36% of a rather special group of people. People who replied to this survey needed to meet the criterion of being a named author on at least two papers published in the last three years in accredited journals in the field of Computational Linguistics (CL) – the field sometimes also known as NLP (Natural Language Processing).

The survey took place in May and June 2022. 327 complete responses were received from people matching the criteria.

A full report on this survey (31 pages) is available here (PDF).

Here’s a screenshot from page 10 of the report, illustrating the answers to questions about Artificial General Intelligence (AGI):

You can see the responses to question 3-4. 36% of the respondents either “agreed” or “weakly agreed” with the statement that

It is plausible that decisions made by AI or machine learning systems could cause a catastrophe this century that is at least as bad as an all-out nuclear war.

That statistic is a useful backdrop to discussions stirred up in the last few days by a video interview given by polymath autodidact and long-time AGI risk researcher Eliezer Yudkowsky:

The publishers of that video chose the eye-catching title “we’re all gonna die”.

If you don’t want to spend 90 minutes watching that video – or if you are personally alienated by Eliezer’s communication style – here’s a useful Twitter thread summary by Liron Shapira:

In contrast to the question posed in the NLP survey I mentioned earlier, Eliezer isn’t thinking about “outcomes of AGI in this century”. His timescales are much shorter. His “ballpark estimate” for the time before AGI arrives is “3-15 years”.

How are people reacting to this sombre prediction?

More generally, what responses are there to the statistic that, as quoted above,

36% of respondents agree that it is plausible that AI could produce catastrophic outcomes in this century, on the level of all-out nuclear war.

I’ve seen a lot of different reactions. They break down into four groups: denial, sabotage, trust, and hustle.

1. Denial

One example of denial is this claim: We’re nowhere near understanding the magic of human minds. Therefore there’s no chance that engineers are going to duplicate that magic in artificial systems.

I have two counters:

  1. The risks of AGI arise, not because the AI may somehow become sentient, and take on the unpleasant aspects of alpha male human nature. Rather, the risks arise from systems that operate beyond our understanding and outside our control, and which may end up pursuing objectives different from the ones we thought (or wished) we had programmed into them
  2. Many systems have been created over the decades without the underlying science being fully understood. Steam engines predated the laws of thermodynamics. More recently, LLMs (Large Language Model AIs) have demonstrated aspects of intelligence that the designers of these systems had not anticipated. In the same way, AIs with some extra features may unexpectedly tip over into greater general intelligence.

Another example of denial: Some very smart people say they don’t believe that AGI poses risks. Therefore we don’t need to pay any more attention to this stupid idea.

My counters:

  1. The mere fact that someone very smart asserts an idea – likely outside of their own field of special expertise – does not confirm the idea is correct
  2. None of these purported objections to the possibility of AGI risk holds water (for a longer discussion, see my book The Singularity Principles).

Digging further into various online discussion threads, I gained the impression that what was motivating some of the denial was a terrible fear. The people loudly proclaiming their denial were trying to cope with depression. The thought of potential human extinction within just 3-15 years was simply too dreadful for them to contemplate.

It’s similar to how people sometimes cope with the death of someone dear to them. There’s a chance my dear friend has now been reunited in an afterlife with their beloved grandparents, they whisper to themselves. Or, It’s sweet and honourable to die for your country: this death was a glorious sacrifice. And then woe betide any uppity humanist who dares to suggest there is no afterlife, or that patriotism is the last refuge of a scoundrel!

Likewise, woe betide any uppity AI risk researcher who dares to suggest that AGI might not be so benign after all! Deny! Deny!! Deny!!!

(For more on this line of thinking, see my short chapter “The Denial of the Singularity” in The Singularity Principles.)

A different motivation for denial is the belief that any sufficient “cure” to the risk of AGI catastrophe would be worse than the risk it was trying to address. This line of thinking goes as follows:

  • A solution to AGI risk will involve pervasive monitoring and widespread restrictions
  • Such monitoring and restrictions will only be possible if an autocratic world government is put in place
  • Any autocratic world government would be absolutely terrible
  • Therefore, the risk of AGI can’t be that bad after all.

I’ll come back later to the flaws in that particular argument. (In the meantime, see if you can spot what’s wrong.)

2. Sabotage

In the video interview, Eliezer made one suggestion for avoiding AGI catastrophe: Destroy all the GPU server farms.

These vast collections of GPUs (a special kind of computing chip) are what enables the training of many types of AI. If these chips were all put out of action, it would delay the arrival of AGI, giving humanity more time to work out a better solution to coexisting with AGI.

Another suggestion Eliezer makes is that the superbright people who are currently working flat out to increase the capabilities of their AI systems should be paid large amounts of money to do nothing. They could lounge about on a beach all day, and still earn more money than they are currently receiving from OpenAI, DeepMind, or whoever is employing them. Once again, that would slow down the emergence of AGI, and buy humanity more time.

I’ve seen other similar suggestions online, which I won’t repeat here, since they come close to acts of terrorism.

All these suggestions have this in common: let’s find ways to stop the development of AI in its tracks, all across the world. Companies should be stopped in their tracks. Shadowy military research groups should be stopped in their tracks. Open source hackers should be stopped in their tracks. North Korean ransomware hackers must be stopped in their tracks.

This isn’t just a suggestion that specific AI developments should be halted, namely those with an explicit target of creating AGI. Instead, it recognises that the creation of AGI might occur via unexpected routes. Improving the performance of various narrow AI systems, including fact-checking, or emotion recognition, or online request interchange marketplaces – any of these might push the collection of AI modules over the critical threshold. Mixing metaphors, AI could go nuclear.

Shutting down all these research activities seems a very tall order. Especially since many of the people who are currently working flat out to increase AI capabilities are motivated, not by money, but by the vision that better AI could do a tremendous amount of good in the world: curing cancer, solving nuclear fusion, improving agriculture by leaps and bounds, and so on. They’re not going to be easy to persuade to change course. For them, there’s a lot more at stake than money.

I have more to say about the question “To AGI or not AGI” in this chapter. In short, I’m deeply sceptical.

In response, a would-be saboteur may admit that their chances of success are low. But what do you suggest instead, they will ask.

Read on.

3. Trust

Let’s start again from the statistic that 36% of the NLP survey respondents agreed, with varying degrees of confidence, that advanced AI could trigger a catastrophe as bad as an all-out nuclear war some time this century.

It’s a pity that the question wasn’t asked with shorter timescales. Comparing the chances of an AI-induced global catastrophe in the next 15 years with one in the next 85 years:

  • The longer timescale makes it more likely that AGI will be developed
  • The shorter timescale makes it more likely that AGI safety research will still be at a primitive (deeply ineffective) level.

Even since the date of the survey – May and June 2022 – many forecasters have shortened their estimates of the likely timeline to the arrival of AGI.

So, for the sake of the argument, let’s suppose that the risk of an AI-induced global catastrophe happening by 2038 (15 years from now) is 1/10.

There are two ways to react to this:

  • 1/10 is fine odds. I feel lucky. What’s more, there are plenty of reasons why we ought to feel lucky
  • 1/10 is terrible odds. That’s far too high a risk to accept. We need to hustle to find ways to change these odds in our favour.

I’ll come to the hustle response in a moment. But let’s first consider the trust response.

A good example is in this comment from SingularityNET founder and CEO Ben Goertzel:

Eliezer is a very serious thinker on these matters and was the core source of most of the ideas in Nick Bostrom’s influential book Superintelligence. But ever since I met him, and first debated these issues with him, back in 2000 I have felt he had a somewhat narrow view of humanity and the universe in general.

There are currents of love and wisdom in our world that he is not considering and seems to be mostly unaware of, and that we can tap into by creating self reflective compassionate AGIs and doing good loving works together with them.

In short, rather than fearing humanity, we should learn to trust humanity. Rather than fearing what AGI will do, we should trust that AGI can do wonderful things.

You can find a much longer version of Ben’s views in the review he wrote back in 2015 of Superintelligence. It’s well worth reading.

What are the grounds for hope? Humanity has come through major challenges in the past. Even though the scale of the challenge is more daunting on this occasion, there are also more people contributing ideas and inspiration than before. AI is more accessible than nuclear weapons, which increases the danger level, but AI could also be deployed as part of the solution, rather than just being a threat.

Another idea is that if an AI looks around for data teaching it which values to respect and uphold, it will find plenty of positive examples in great human literature. OK, that literature also includes lots of treachery, and different moral codes often conflict, but a wise AGI should be able to see through all these conclusions to discern the importance of defending human flourishing. OK, much of AI training at the moment focuses on deception, manipulation, enticement, and surveillance, but, again, we can hope that a wise AGI will set aside those nastier aspects of human behaviour. Rather than aping trolls or clickbait, we can hope that AGI will echo the better angels of human nature.

It’s also possible that, just as DeepMind’s AlphaZero worked out by itself, without any human game data, superior strategies at the board games Go and chess, a future AI might work out, by itself, the principles of universal morality. (That’s assuming such principles exist.)

We would still have to hope, in such a case, that the AI that worked out the principles of universal morality would decide to follow these principles, rather than having some alternative (alien) ways of thinking.

But surely hope is better than despair?

To quote Ben Goertzel again:

Despondence is unwarranted and unproductive. We need to focus on optimistically maximizing odds of a wildly beneficial Singularity together.   

My view is the same as expressed by Berkeley professor of AI Stuart Russell, in part of a lengthy exchange with Steven Pinker on the subject of AGI risks:

The meta argument is that if we don’t talk about the failure modes, we won’t be able to address them…

Just like in nuclear safety, it’s not against the rules to raise possible failure modes like, what if this molten sodium that you’re proposing should flow around all these pipes? What if it ever came into contact with the water that’s on the turbine side of the system? Wouldn’t you have a massive explosion which could rip off the containment and so on? That’s not exactly what happened in Chernobyl, but not so dissimilar…

The idea that we could solve that problem without even mentioning it, without even talking about it and without even pointing out why it’s difficult and why it’s important, that’s not the culture of safety. That’s sort of more like the culture of the communist party committee in Chernobyl, that simply continued to assert that nothing bad was happening.

(By the way, my sympathies in that long discussion, when it comes to AGI risk, are approximately 100.0% with Russell and approximately 0.0% with Pinker.)

4. Hustle

The story so far:

  • The risks are real (though estimates of their probability vary)
  • Some possible “solutions” to the risks might produce results that are, by some calculations, worse than letting AGI take its own course
  • If we want to improve our odds of survival – and, indeed, for humanity to reach something like a sustainable superabundance with the assistance of advanced AIs – we need to be able to take a clear, candid view of the risks facing us
  • Being naïve about the dangers we face is unlikely to be the best way forward
  • Since time may be short, the time to press for better answers is now
  • We shouldn’t despair. We should hustle.

Some ways in which research could generate useful new insight relatively quickly:

  • When the NLP survey respondents expressed their views, what reasons did they have for disagreeing with the statement? And what reasons did they have for agreeing with it? And how do these reasons stand up, in the cold light of a clear analysis? (In other words, rather than a one-time survey, an iterative Delphi survey should lead to deeper understanding.)
  • Why have the various AI safety initiatives formed in the wake of the Puerto Rico and Asilomar conferences of 2015 and 2017 fallen so far short of expectations?
  • Which descriptions of potential catastrophic AI failure modes are most likely to change the minds of those critics who currently like to shrug off failure scenarios as “unrealistic” or “Hollywood fantasy”?

Constructively, I invite conversation on the strengths and weaknesses of the 21 Singularity Principles that I have suggested as contributing to improving the chances of beneficial AGI outcomes.

For example:

  • Can we identify “middle ways” that include important elements of global monitoring and auditing of AI systems, without collapsing into autocratic global government?
  • Can we improve the interpretability and explainability of advanced AI systems (perhaps with the help of trusted narrow AI tools), to diminish the risks of these systems unexpectedly behaving in ways their designers failed to anticipate?
  • Can we deepen our understanding of the ways new capabilities “emerge” in advanced AI systems, with a particular focus on preventing the emergence of alternative goals?

I also believe we should explore more fully the possibility that an AGI will converge on a set of universal values, independent of whatever training we provide it – and, moreover, the possibility that these values will include upholding human flourishing.

And despite my saying just now that these values would be “independent of whatever training we provide”, is there, nevertheless, a way for us to tilt the landscape so that the AGI is more likely to reach and respect these conclusions?

Postscript

To join me in “camp hustle”, visit Future Surge, which is the activist wing of London Futurists.

If you’re interested in the ideas of my book The Singularity Principles, here’s a podcast episode in which Calum Chace and I discuss some of these ideas more fully.

In a subsequent episode of our podcast, Calum and I took another look at the same topics, this time with Millennium Project Executive Director Jerome Glenn: “Governing the transition to AGI”.

19 December 2022

Rethinking

Filed under: AGI, politics, Singularity Principles — Tags: , , — David Wood @ 2:06 am

I’ve been rethinking some aspects of AI control and AI alignment.

In the six months since publishing my book The Singularity Principles: Anticipating and Managing Cataclysmically Disruptive Technologies, I’ve been involved in scores of conversations about the themes it raises. These conversations have often brought my attention to fresh ideas and different perspectives.

These six months have also seen the appearance of numerous new AI models with capabilities that often catch observers by surprise. The general public is showing a new willingness (at least some of the time) to consider the far-reaching implications of these AI models and their more powerful successors.

People from various parts of my past life have been contacting me. The kinds of things they used to hear me forecasting – the kinds of things they thought, at the time, were unlikely to ever happen – are becoming more credible, more exciting, and, yes, more frightening.

They ask me: What is to be done? And, pointedly, Why aren’t you doing more to stop the truly bad outcomes that now seem ominously likely?

The main answer I give is: read my book. Indeed, you can find all the content online, spread out over a family of webpages.

In fact, my request is that people read my book all the way through. That’s because later chapters of that book anticipate questions that tend to come to readers’ minds during earlier chapters, and try to provide answers.

Six months later, although I would give some different (newer) examples were I to rewrite that book today, I stand by the analysis I offered and the principles I championed.

However, I’m inclined to revise my thinking on a number of points. Please find these updates below.

An option to control superintelligent AI

I remain doubtful about the prospects for humans to retain control of any AGI (Artificial General Intelligence) that we create.

That is, the arguments I gave in my chapter “The AI Control Problem” still look strong to me.

But one line of thinking may have some extra mileage. That’s the idea of keeping AGI entirely as an advisor to humans, rather than giving it any autonomy to act directly in the world.

Such an AI would provide us with many recommendations, but it wouldn’t operate any sort of equipment.

More to the point: such an AI would have no desire to operate any sort of equipment. It would have no desires whatsoever, nor any motivations. It would simply be a tool. Or, to be more precise, it would simply be a remarkable tool.

In The Singularity Principles I gave a number of arguments why that idea is unsustainable:

  • Some decisions require faster responses than slow-brained humans can provide; that is, AIs with direct access to real-world levers and switches will be more effective than those that are merely advisory
  • Smart AIs will inevitably develop “subsidiary goals” (intermediate goals) such as having greater computational power, even when there is no explicit programming for such goals
  • As soon as a smart AI acquires any such subsidiary goal, it will find ways to escape any confinement imposed by human overseers.

But I now think this should be explored more carefully. Might a useful distinction be made between:

  1. AIs that do have direct access to real-world levers and switches – with the programming of such AIs being carefully restricted to narrow lines of thinking
  2. AIs with more powerful (general) capabilities, that operate purely in advisory capacities.

In that case, the damage that could be caused by failures of the first type of AI, whilst significant, would not involve threats to the entirety of human civilisation. And failures of the second type of AI would be restricted by the actions of humans as intermediaries.

This approach would require confidence that:

  1. The capabilities of AIs of the first type will remain narrow, despite competitive pressures to give these systems at least some extra rationality
  2. The design of AIs of the second type will prevent the emergence of any dangerous “subsidiary goals”.

As a special case of the second point, the design of these AIs will need to avoid any risk of the systems developing sentience or intrinsic motivation.

These are tough challenges – especially since we still have only a vague understanding of how desires and/or sentience can emerge as smaller systems combine and evolve into larger ones.

But since we are short of other options, it’s definitely something to be considered more fully.

An option for automatically aligned superintelligence

If controlling an AGI turns out to be impossible – as seems likely – what about the option that an AGI will have goals and principles that are fundamentally aligned with human wellbeing?

In such a case, it will not matter if an AGI is beyond human control. The actions it takes will ensure that humans have a very positive future.

The creation of such an AI – sometimes called a “friendly AI” – remains my best hope for humanity’s future.

However, there are severe difficulties in agreeing and encoding “goals and principles that are fundamentally aligned with human wellbeing”. I reviewed these difficulties in my chapter “The AI Alignment Problem”.

But what if such goals and principles are somehow part of an objective reality, awaiting discovery, rather than needing to be invented? What if something like the theory of “moral realism” is true?

In this idea, a principle like “treat humans well” would follow from some sort of a priori logical analysis, a bit like the laws of mathematics (such as the fact, discovered by one of the followers of Pythagoras, that the square root of two is an irrational number).

Accordingly, a sufficiently smart AGI would, all being well, reach its own conclusion that humans ought to be well treated.

Nevertheless, even in this case, significant risks would remain:

  • The principle might be true, but an AGI might not be motivated to discover it
  • The principle might be true, but an AGI, despite its brilliance, may fail to discover it
  • The principle might be true, and an AGI might recognise it, but it may take its own decision to ignore it – like the way that we humans often act in defiance of what we believe at the time to be overarching moral principles

The design criteria and initial conditions that we humans provide for an AGI may well influence the outcome of these risk factors.

I plan to return to these weighty matters in a future blog post!

Two different sorts of control

I’ve come to realise that there are not one but two questions of control of AI:

  1. Can we humans retain control of an AGI that we create?
  2. Can society as a whole control the actions of companies (or organisations) that may create an AGI?

Whilst both these control problems are profoundly hard, the second is less hard.

Moreover, it’s the second problem which is the truly urgent one.

This second control problem involves preventing teams inside corporations (and other organisations) from rushing ahead without due regard to questions of the potential outcomes of their work.

It’s the second control problem that the 21 principles which I highlight in my book are primarily intended to address.

When people say “it’s impossible to solve the AI control problem”, I think they may be correct regarding the first problem, but I passionately believe they’re wrong concerning the second problem.

The importance of psychology

When I review what people say about the progress and risks of AI, I am frequently struck by the fact that apparently intelligent people are strongly attached to views that are full of holes.

When I try to point out the flaws in their thinking, they hardly seem to pause in their stride. They display a stubborn confidence that they are correct.

What’s at play here is more than logic. It’s surely a manifestation of humanity’s often defective psychology.

My book includes a short chapter “The denial of the Singularity” which touched on various matters of psychology. If I were to rewrite my book today, I believe that chapter would become larger, and that psychological themes would be spread more widely throughout the book.

Of course, noticing psychological defects is only the start of making progress. Circumventing or transcending these defects is an altogether harder question. But it’s one that needs a lot more attention.

The option of merging with AI

How can we have a better, more productive conversation about anticipating and managing AGI?

How can we avoid being derailed by ineffective arguments, hostile rhetoric, stubborn prejudices, hobby-horse obsessions, outdated ideologies, and (see the previous section) flawed psychology?

How might our not-much-better-than-monkey brains cope with the magnitude of these questions?

One possible answer is that technology can help us (so long as we use it wisely).

For example, the chapter “Uplifting politics”, from near the end of my book, listed ten ways for “technology improving politics”.

More broadly, we humans have the option to selectively deploy some aspects of technology to improve our capabilities in handling other aspects of technology.

We must recognise that technology is no panacea. But it can definitely make a big difference.

Especially if we restrict ourselves to putting heavy reliance only on those technologies – narrow technologies – whose mode of operation we fully understand, and where risks of malfunction can be limited.

This forms part of a general idea that “we humans don’t need to worry about being left behind by robots, or about being subjugated by robots, since we will be the robots”.

As I put it in the chapter “No easy solutions” in my book,

If humans merge with AI, humans could remain in control of AIs, even as these AIs rapidly become more powerful. With such a merger in place, human intelligence will automatically be magnified, as AI improves in capability. Therefore, we humans wouldn’t need to worry about being left behind.

Now I’ve often expressed strong criticisms of this notion of merger. I still believe these criticisms are sound.

But what these criticisms show is that any such merger cannot be the entirety of our response to the prospect of the emergence of AGI. It can only be part of the solution. That’s especially true because humans-augmented-by-technology are still very likely to lag behind pure technology systems, until such time as human minds might be removed from biological skulls and placed into new silicon hosts. That’s something that I’m not expecting to happen before the arrival of AGI, so it will be too late to solve (by itself) the problems of AI alignment and control.

(And since you ask, I probably won’t be in any hurry, even after the arrival of AGI, for my mind to be removed from my biological skull. I guess I might rethink that reticence in due course. But that’s rethinking for another day.)

The importance of politics

Any serious discussion about managing cataclysmically disruptive technologies (such as advanced AIs) pretty soon rubs up against the questions of politics.

That’s not just small-p “politics” – questions of how to collaborate with potential partners where there are many points of disagreement and even dislike.

It’s large-P “Politics” – interacting with presidents, prime ministers, cabinets, parliaments, and so on.

Questions of large-P politics occur throughout The Singularity Principles. My thinking now, six months afterwards, is that even more focus should be placed on the subject of improving politics:

  • Helping politics to escape the clutches of demagogues and autocrats
  • Helping politics to avoid stultifying embraces between politicians and their “cronies” in established industries
  • Ensuring that the best insights and ideas of the whole electorate can rise to wide attention, without being quashed or distorted by powerful incumbents
  • Bringing everyone involved in politics rapidly up-to-date with the real issues regarding cataclysmically disruptive technologies
  • Distinguishing effective regulations and incentives from those that are counter-productive.

As 2022 has progressed, I’ve seen plenty of new evidence of deep problems within political systems around the world. These problems were analysed with sharp insight in the book The Revenge of Power by Moisés Naím that I recently identified as “the best book that I read in 2022”.

Happily, alongside the evidence of deep problems in our politics worldwide, there are also encouraging signs and sensible plans for improvement. You can find some of these plans inside the book by Naím, and, yes, I offer suggestions in my own book too.

To accelerate improvements in politics was one of the reasons I created Future Surge a few months back. That’s an initiative on which I expect to spend a lot more of my time in 2023.

Note: the image underlying the picture at the top of this article was created by DALL·E 2 from the prompt “A brain with a human face on it rethinks, vivid stormy sky overhead, photorealistic style”.

3 November 2022

Four options for avoiding an AI cataclysm

Let’s consider four hard truths, and then four options for a solution.

Hard truth 1: Software has bugs.

Even when clever people write the software, and that software passes numerous verification tests, any complex software system generally still has bugs. If the software encounters a circumstance outside its verification suite, it can go horribly wrong.

Hard truth 2: Just because software becomes more powerful, that won’t make all the bugs go away.

Newer software may run faster. It may incorporate input from larger sets of training data. It may gain extra features. But none of these developments mean the automatic removal of subtle errors in the logic of the software, or shortcomings in its specification. It might still reach terrible outcomes – just quicker than before!

Hard truth 3: As AI becomes more powerful, there will be more pressure to deploy it in challenging real-world situations.

Consider the real-time management of:

  • Complex arsenals of missiles, anti-missile missiles, and so on
  • Geoengineering interventions, which are intended to bring the planet’s climate back from the brink of a cascade of tipping points
  • Devious countermeasures against the growing weapons systems of a group (or nation) with a dangerously unstable leadership
  • Social network conversations, where changing sentiments can have big implications for electoral dynamics or for the perceived value of commercial brands
  • Ultra-hot plasmas inside whirling magnetic fields in nuclear fusion energy generators
  • Incentives for people to spend more money than is wise, on addictive gambling sites
  • The buying and selling of financial instruments, to take advantage of changing market sentiments.

In each case, powerful AI software could be a very attractive option. A seductive option. Especially if it has been written by clever people, and appears to have a good track record of delivering results.

Until it goes wrong. In which case the result could be cataclysmic. (Accidental nuclear war. The climate walloped past a tipping point in the wrong direction. Malware going existentially wrong. Partisan outrage propelling a psychological loose cannon over the edge. Easy access to weapons of mass destruction. Etc.)

Indeed, the real risk of AI cataclysm – as opposed to the Hollywood version of any such risk – is that an AI system may acquire so much influence over human society and our surrounding environment that a mistake in that system could cataclysmically reduce human wellbeing all over the world. Billions of lives could be extinguished, or turned into a very pale reflection of their present state.

Such an outcome could arise in any of four ways – four catastrophic error modes. In brief, these are:

  1. Implementation defect
  2. Design defect
  3. Design overridden
  4. Implementation overridden.

Hard truth 4: There are no simple solutions to the risks described above.

What’s more, people who naively assume that a simple solution can easily be put in place (or already exists) are making the overall situation worse. They encourage complacency, whereas greater attention is urgently needed.

But perhaps you disagree?

That’s the context for the conversation in Episode 11 of the London Futurists Podcast, which was published yesterday morning.

In just thirty minutes, that episode dug deep into some of the ideas in my recent book The Singularity Principles. Co-host Calum Chace and I found plenty on which to agree, but had differing opinions on one of the most important questions.

Calum listed three suggestions that people sometimes make for how the dangers of potentially cataclysmic AI might be handled.

In response, I described a different approach – something that Calum said would be a fourth idea for a solution. As you can hear from the recording of the podcast, I evidently left him unconvinced.

Therefore, I’d like to dig even deeper.

Option 1: Humanity gets lucky

It might be the case that AI software that is smart enough will embody an unshakeable commitment toward humanity having the best possible experience.

Such software won’t miscalculate (after all, it is superintelligent). If there are flaws in how it has been specified, it will be smart enough to notice these flaws, rather than stubbornly following through on the letter of its programming. (After all, it is superintelligent.)

Variants of this wishful thinking exist. In some variants, what will guarantee a positive outcome isn’t just a latent tendency of superintelligence toward superbenevolence. It’s the invisible hand of the free market that will guide consumer choices away from software that might harm users, toward software that never, ever, ever goes wrong.

My response here is that software which appears to be bug free can, nevertheless, harbour deep mistakes. It may be superintelligent, but that doesn’t mean it’s omniscient or infallible.

Second, software which is bug free may be monstrously efficient at doing what some of its designers had in mind – manipulating consumers into actions which increase the share price of a given corporation, despite all the externalities arising.

Moreover, it’s too much of a stretch to say that greater intelligence always makes you wiser and kinder. There are plenty of dreadful counterexamples, from humans in the worlds of politics, crime, business, academia, and more. Who is to say that a piece of software with an IQ equivalent to 100,000 will be sure to treat us humans any better than we humans sometimes treat swarms of insects (e.g. ant colonies) that get in our way?

Do you feel lucky? My view is that any such feeling, in these circumstances, is rash in the extreme.

Option 2: Safety engineered in

Might a team of brilliant AI researchers, Mary and Flo (to make up a couple of names), devise a clever method that will ensure their AI (once it is built) never harms humanity?

Perhaps the answer lies in some advanced mathematical wizardry. Or in chiselling a 21st century version of Asimov’s Laws of Robotics into the chipsets at the heart of computer systems. Or in switching from “correlation logic” to “causation logic”, or some other kind of new paradigm in AI systems engineering.

Of course, I wish Mary and Flo well. But their ongoing research won’t, by itself, prevent lots of other people releasing their own unsafe AI first. Especially when these other engineers are in a hurry to win market share for their companies.

Indeed, the considerable effort being invested by various researchers and organisations in a search for a kind of fix for AI safety is, arguably, a distraction from a sober assessment of the bigger picture. Better technology, better product design, better mathematics, and better hardware can all be part of the full solution. But that full solution also needs, critically, to include aspects of organisational design, economic incentives, legal frameworks, and political oversight. That’s the argument I develop in my book. We ignore these broader forces at our peril.

Option 3: Humans merge with machines

If we can’t beat them, how about joining them?

If human minds are fused into silicon AI systems, won’t the good human sense of these minds counteract any bugs or design flaws in the silicon part of the hybrid formed?

With such a merger in place, human intelligence will automatically be magnified, as AI improves in capability. Therefore, we humans wouldn’t need to worry about being left behind. Right?

I see two big problems with this idea. First, so long as human intelligence is rooted in something like the biology of the brain, the mechanisms for any such merger may only allow relatively modest increases in human intelligence. Our biological brains would be bottlenecks that constrain the speed of progress in this hybrid case. Compared to pure AIs, the human-AI hybrid would, after all, be left behind in this intelligence race. So much for humans staying in control!

An even bigger problem is the realisation that a human with superhuman intelligence is likely to be at least as unpredictable and dangerous as an AI with superhuman intelligence. The magnification of intelligence will allow that superhuman human to do all kinds of things with great vigour – settling grudges, acting out fantasies, demanding attention, pursuing vanity projects, and so on. Recall: power tends to corrupt. Such a person would be able to destroy the earth. Worse, they might want to do so.

Another way to state this point is that, just because AI elements are included inside a person, that won’t magically ensure that these elements become benign, or are subject to the full control of the person’s best intentions. Consider as comparisons what happens when biological viruses enter a person’s body, or when a cancer grows there. In neither case does the intruding element lose its ability to cause damage, just on account of being part of a person who has humanitarian instincts.

This reminds me of the statement that is sometimes heard, in defence of accelerating the capabilities of AI systems: “I am not afraid of artificial intelligence. I am afraid of human stupidity”.

In reality, what we need to fear is the combination of imperfect AI and imperfect humanity.

The conclusion of this line of discussion is that we need to do considerably more than enable greater intelligence. We also need to accelerate greater wisdom – so that any beings with superhuman intelligence will operate truly beneficently.

Option 4: Greater wisdom

The cornerstone insight of ethics is that, just because we can do something, and indeed may even want to do that thing, it doesn’t mean we should do that thing.

Accordingly, human societies since prehistory have placed constraints on how people should behave.

Sometimes, moral sanction is sufficient: people constrain their actions in deference to public opinion. In other cases, restrictions are codified into laws and regulations.

Likewise, just because a corporation could boost its profits by releasing a new version of its AI software, that doesn’t mean it should release that software.

But what is the origin of these “should” imperatives? And how do we resolve conflicts, when two different groups of people champion two different sets of ethical intuitions?

Where can we find a viable foundation for ethical restrictions – something more solid than “we’ve always done things like this” or “this feels right to me” or “we need to submit to the dictates in our favourite holy scripture”?

Welcome to the world of philosophy.

It’s a world that, according to some observers, has made little progress over the centuries. People still argue over fundamentals. Deontologists square off against consequentialists. Virtue ethicists stake out a different position.

It’s a world in which it is easier to poke holes in the views held by others than to defend a consistent view of your own.

But it’s my position that the impending threat of cataclysmic AI impels us to reach a wiser agreement.

It’s like how the devastation of the Covid pandemic impelled society to find significantly quicker ways to manufacture, verify, and deploy vaccines.

It’s like how society can come together, remarkably, in a wartime situation, notwithstanding the divisions that previously existed.

In the face of the threats of technology beyond our control, minds should focus, with unprecedented clarity. We’ll gradually build a wider consensus in favour of various restrictions and, yes, in favour of various incentives.

What’s your reaction? Is option 4 simply naïve?

Practical steps forward

Rather than trying to “boil the ocean” of philosophical disputes over contrasting ethical foundations, we can, and should, proceed in a kaizen manner.

To start with, we can give our attention to specific individual questions:

  • What are the circumstances when we should welcome AI-powered facial recognition software, and when should we resist it?
  • What are the circumstances when we should welcome AI systems that supervise aspects of dangerous weaponry?
  • What are the circumstances that could transform AI-powered monitoring systems from dangerous to helpful?

As we reach some tentative agreements on these individual matters, we can take the time to highlight principles with potential wider applicability.

In parallel, we can revisit some of the agreements (explicit and implicit) for how we measure the health of society and the liberties of individuals:

  • The GDP (Gross Domestic Product) statistics that provide a perspective on economic activities
  • The UDHR (Universal Declaration of Human Rights) statement that was endorsed in the United Nations General Assembly in 1948.

I don’t deny it will be hard to build consensus. It will be even harder to agree how to enforce the guidelines arising – especially in light of the wretched partisan conflicts that are poisoning the political processes in a number of parts of the world.

But we must try. And with some small wins under our belt, we can anticipate momentum building.

These are some of the topics I cover in the closing chapters of The Singularity Principles.

I by no means claim to know all the answers.

But I do believe that these are some of the most important questions to address.

And something that could help us make progress is – you guessed it – AI. In the right circumstances, AI can help us think more clearly, and can propose new syntheses of our previous ideas.

Thus today’s AI can provide stepping stones to the design and deployment of better, safer, wiser AI tomorrow. That’s provided we maintain human oversight.

Footnotes

The image above includes a design by Pixabay user Alexander Antropov, used with thanks.

See also this article by Calum in Forbes, Taking Back Control Of The Singularity.

15 May 2022

A year-by-year timeline to 2045

The ground rules for the worldbuilding competition were attractive:

  • The year is 2045.
  • AGI has existed for at least 5 years.
  • Technology is advancing rapidly and AI is transforming the world sector by sector.
  • The US, EU and China have managed a steady, if uneasy, power equilibrium.
  • India, Africa and South America are quickly on the ride as major players.
  • Despite ongoing challenges, there have been no major wars or other global catastrophes.
  • The world is not dystopian and the future is looking bright.

Entrants were asked to submit four pieces of work. One was a new media piece. I submitted this video:

Another required piece was:

timeline with entries for each year between 2022 and 2045 giving at least two events (e.g. “X invented”) and one data point (e.g. “GDP rises by 25%”) for each year.

The timeline I created dovetailed with the framework from the above video. Since I enjoyed creating it, I’m sharing my submission here, in the hope that it may inspire readers.

(Note: the content was submitted on 11th April 2022.)

2022

US mid-term elections result in log-jammed US governance, widespread frustration, and a groundswell desire for more constructive approaches to politics.

The collapse of a major crypto “stablecoin” results in much wider adverse repercussions than was generally expected, and a new social appreciation of the dangers of flawed financial systems.

Data point: Number of people killed in violent incidents (including homicides and armed conflicts) around the world: 590,000

2023

Fake news that is spread by social media driven by a new variant of AI provokes riots in which more than 10,000 people die, leading to much greater interest in a set of “Singularity Principles” that had previously been proposed to steer the development of potentially world-transforming technologies.

G7 transforms into the D16, consisting of the world’s 16 leading democracies, proclaiming a profound shared commitment to champion norms of: openness; free and fair elections; the rule of law; independent media, judiciary, and academia; power being distributed rather than concentrated; and respect for autonomous decisions of groups of people.

Data point: Proportion of world population living in countries that are “full democracies” as assessed by the Economist: 6.4%

2024

South Korea starts a trial of a nationwide UBI scheme, in the first of what will become in later years a long line of increasingly robust “universal citizens’ dividends” schemes around the world.

A previously unknown offshoot of ISIS releases a bioengineered virus. Fortunately, vaccines are quickly developed and deployed against it. In parallel, a bitter cyber war takes place between Iran and Israel. These incidents lead to international commitments to prevent future recurrences.

Data point: Proportion of people of working age in US who are not working and who are not looking for a job: 38%

2025

Extreme weather – floods and storms – kills tens of thousands in both North America and Europe. A major trial of geo-engineering is rushed through, with reflection of solar radiation in the stratosphere – causing global political disagreement and then a renewed determination for tangible shared action on climate change.

The US President appoints a Secretary for the Future as a top-level cabinet position. More US states adopt ranked-choice voting, allowing third parties to grow in prominence.

Data point: Proportion of earth’s habitable land used to rear animals for human food: 38%

2026

A song created entirely by an AI tops the hit parade, and initiates a radical new musical genre.

Groundswell opposition to autocratic rule in Russia leads to the fall from power of the president and a new dedication to democracy throughout countries formerly perceived as being within Russia’s sphere of direct influence.

Data point: Net greenhouse gas emissions (including those from land-use changes): 59 billion tons of CO2 equivalent – an unwelcome record.

2027

Metformin approved for use as an anti-aging medicine in a D16 country. Another D16 country recommends nationwide regular usage of a new nootropic drug.

Exchanges of small numbers of missiles between North and South Korea lead to regime change inside North Korea and a rapprochement between the long-bitter enemies.

Data point: Proportion of world population living in countries that are “full democracies” as assessed by the Economist: 9.2%

2028

An innovative nuclear fusion system, with its design assisted by AI, runs for more than one hour and generates significantly more energy than was put in.

As a result of disagreements about the future of an independent Taiwan, an intense destructive cyber battle takes place. At the end, the nations of the world commit more seriously than before to avoiding any future cyber battles.

Data point: Proportion of world population experiencing mental illness or dissatisfied with the quality of their mental health: 41%

2029

A trial of an anti-aging intervention in middle-aged dogs is confirmed to have increased remaining life expectancy by 25% without causing any adverse side effects. Public interest in similar interventions in humans skyrockets.

The UK rejoins a reconfigured EU, as an indication of support for sovereignty that is pooled rather than narrow.

Data point: Proportion of world population with formal cryonics arrangements: 1 in 100,000

2030

Russia is admitted into the D40 – a newly expanded version of the D16. The D40 officially adopts an “Index of Human Flourishing” as a more important metric than GDP, and agrees a revised version of the Universal Declaration of Human Rights, brought up to date with transhuman issues.

First permanent implant in a human of an artificial heart with a new design that draws all required power from the biology of the body rather than any attached battery, and whose pace of operation is under the control of the brain.

Data point: Net greenhouse gas emissions (including those from land-use changes): 47 billion tons of CO2 equivalent – a significant improvement

2031

An AI discovers and explains a profound new way of looking at mathematics, DeepMath, leading in turn to dramatically successful new theories of fundamental physics.

Widespread use of dynamically re-programmed nanobots to treat medical conditions that would previously have been fatal.

Data point: Proportion of world population regularly taking powerful anti-aging medications: 23%

2032

First person reaches the age of 125. Her birthday celebrations are briefly disrupted by a small group of self-described “naturality advocates” who chant “120 is enough for anyone”, but that group has little public support.

D40 countries put in place a widespread “trustable monitoring system” to cut down on existential risks (such as spread of WMDs) whilst maintaining citizens’ trust.

Data point: Proportion of world population living in countries that are “full democracies” as assessed by the Economist: 35.7% 

2033

For the first time since the 1850s, the US President comes from a party other than Republican and Democratic.

An AI system convincingly passes the Turing test, impressing even its previously staunchest critics with its apparent grasp of general knowledge and common sense. The answers it gives to questions about moral dilemmas also impress former sceptics.

Data point: Proportion of people of working age in US who are not working and who are not looking for a job: 58%

2034

The D90 (expanded from the D40) agrees to vigorously impose Singularity Principles rules to avoid inadvertent creation of dangerous AGI.

Atomically precise synthetic nanoscale assembly factories have come of age, in line with the decades-old vision of nanotechnology pioneer Eric Drexler, and are proving to have just as consequential an impact on human society as AI.

Data point: Net greenhouse gas *removals*: 10 billion tons of CO2 equivalent – a dramatic improvement

2035

A novel written entirely by an AI reaches the top of the New York Times bestseller list, and is widely celebrated as being the finest piece of literature ever produced.

Successful measures to remove greenhouse gases from the atmosphere, coupled with wide deployment of clean energy sources, lead to a declaration of “victory over runaway climate change”.

Data point: Proportion of earth’s habitable land used to rear animals for human food: 4%

2036

A film created entirely by an AI, without any real human actors, wins Oscar awards.

The last major sceptical holdout, a philosophy professor from an Ivy League university, accepts that AGI now exists. The pope gives his blessing too.

Data point: Proportion of world population with cryonics arrangements: 24%

2037

The last instances of the industrial-scale slaughter of animals for human consumption take place, on account of the worldwide adoption of cultivated (lab-grown) meat.

AGI convincingly explains that it is not sentient, and that it has a very different fundamental structure from that of biological consciousness.

Data point: Proportion of world population who are literate: 99.3%

2038

Rejuvenation therapies are in wide use around the world. “Eighty is the new fifty”. First person reaches the age of 130.

Improvements made by AGI upon itself effectively raise its IQ one hundred fold, taking it far beyond the comprehension of human observers. However, the AGI provides explanatory educational material that allows people to understand vast new sets of ideas.

Data point: Proportion of world population who consider themselves opposed to AGI: 0.1%

2039

An extensive set of “vital training” sessions has been established by the AGI, with all citizens over the age of ten participating for a minimum of seven hours per day on 72 days each year, to ensure that humans develop and maintain key survival skills.

Menopause reversal is commonplace. Women who had long ago given up any ideas of bearing another child happily embrace motherhood again.

Data point: Proportion of world population regularly taking powerful anti-aging medications: 99.2%

2040

The use of “mind phones” is widespread: new brain-computer interfaces that allow communication between people by mental thought alone.

People regularly opt to have several of their original biological organs replaced by synthetic alternatives that are more efficient, more durable, and more reliable.

Data point: Proportion of people of working age in US who are not working and who are not looking for a job: 96%

2041

Shared immersive virtual reality experiences include hyper-realistic simulations of long-dead individuals – including musicians, politicians, royalty, saints, and founders of religions.

The number of miles of journey undertaken by small “flying cars” exceeds that of ground-based powered transport.

Data point: Proportion of world population living in countries that are “full democracies” as assessed by the Economist: 100.0%

2042

First successful revival of a mammal from cryopreservation.

AGI presents a proof of the possibility of time travel, but the safe transit of humans through time would require resources equivalent to building a Dyson sphere around the sun.

Data point: Proportion of world population experiencing mental illness or dissatisfied with the quality of their mental health: 0.4%

2043

First person reaches the age of 135, and declares herself to be healthier than at any time in the preceding four decades.

As a result of virtual reality encounters with avatars of founders of religions, a number of new systems of philosophical and mystical thinking grow in popularity.

Data point: Proportion of world’s energy provided by earth-based nuclear fusion: 75%

2044

First human baby born from an ectogenetic pregnancy.

Family holidays on the Moon are an increasingly common occurrence.

Data point: Average amount of their waking time that people spend in a metaverse: 38%

2045

First revival of a human from cryopreservation – someone who had been cryopreserved ten years previously.

Subtle messages decoded by AGI from far distant stars in the galaxy confirm that other intelligent civilisations exist, and are on their way to reveal themselves to humanity.

Data point: Number of people killed in violent incidents around the world: 59

Postscript

My thanks go to the competition organisers, the Future of Life Institute, for providing the inspiration for the creation of the above timeline.

Readers are likely to have questions in their minds as they browse the timeline above. More details of the reasoning behind the scenarios involved are contained in three follow-up posts:
