dw2

25 August 2025

The biggest blockages to successful governance of advanced AI

“Humanity has never faced a greater problem than itself.”

That phrase was what my brain hallucinated while I was browsing the opening section of the Introduction to the groundbreaking new book Global Governance of the Transition to Artificial General Intelligence, written by my friend and colleague Jerome C. Glenn, Executive Director of The Millennium Project.

I thought to myself: That’s a bold but accurate way of summing up the enormous challenge faced by humanity over the next few years.

In previous centuries, our biggest problems often came from the environment around us: deadly pathogens, devastating earthquakes, torrential storms, plagues of locusts – as well as marauding hordes of invaders from outside our local neighbourhood.

But in the second half of the 2020s, our problems are being compounded as never before by our own human inadequacies:

  • We’re too quick to rush to judgement, seeing only parts of the bigger picture
  • We’re too loyal to the tribes to which we perceive ourselves as belonging
  • We’re overconfident in our ability to know what’s happening
  • We’re too comfortable with manufacturing and spreading untruths and distortions
  • We’re too bound into incentive systems that prioritise short-term rewards
  • We’re too fatalistic, as regards the possible scenarios ahead.

You may ask, What’s new?

What’s new is the combination of these deep flaws in human nature with technology that is remarkably powerful yet opaque and intractable. AI that is increasingly beyond our understanding and beyond our control is being coupled in potentially devastating ways with our over-hasty, over-tribal, over-confident thoughts and actions. New AI systems are being rushed into deployment and used in attempts:

  • To manufacture and spread truly insidious narratives
  • To incentivize people around the world to act against their own best interests, and
  • To resign people to inaction when in fact it is still within their power to alter and uplift the trajectory of human destiny.

In case this sounds like a counsel of despair, I should clarify at once my appreciation of aspects of human nature that are truly wonderful, as counters to the negative characteristics that I have already mentioned:

  • Our thoughtfulness, that can counter rushes to judgement
  • Our collaborative spirit, that can transcend partisanship
  • Our wisdom, that can recognise our areas of lack of knowledge or lack of certainty
  • Our admiration for truth, integrity, and accountability, that can counter ends-justify-the-means expediency
  • Our foresight, that can counter short-termism and free us from locked-in inertia
  • Our creativity, to imagine and then create better futures.

Just as AI can magnify the regrettable aspects of human nature, so also it can, if used well, magnify those commendable aspects.

So, which is it to be?

The fundamental importance of governance

The question I’ve just asked isn’t one that can be answered by individuals alone. Any one group – whether an organisation, a corporation, or a decentralised partnership – can have its own beneficial actions overtaken and capsized by the catastrophic actions of other groups that fail to heed the better angels of their nature, and which, instead, allow themselves to be governed by wishful naivety, careless bravado, pangs of jealousy, hostile alienation, assertive egotism, or the madness of the crowd.

That’s why the message of this new book by Jerome Glenn is so timely: the processes of developing and deploying increasingly capable AIs need to be:

  • Governed, rather than happening chaotically
  • Globally coordinated, rather than there being no cohesion between the different governance processes applicable in different localities
  • Progressed urgently, without being crowded out by all the shorter-term issues that, understandably, also demand governance attention.

Before giving more of my own thoughts about this book, let me share some of the commendations it has received:

  • “This book is an eye-opening study of the transition to a completely new chapter of history.” – Csaba Korösi, 77th President of the UN General Assembly
  • “A comprehensive overview, drawing both on leading academic and industry thinkers worldwide, and valuable perspectives from within the OECD, United Nations.” – Jaan Tallinn, founding engineer, Skype and Kazaa; co-founder, Cambridge Centre for the Study of Existential Risk and the Future of Life Institute
  • “Written in lucid and accessible language, this book is a must read for people who care about the governance and policy of AGI.” – Lan Xue, Chair of the Chinese National Expert Committee on AI Governance.

The book also carries an absorbing foreword by Ben Goertzel. In this foreword, Ben introduces himself as follows:

Since the 1980s, I have been immersed in the field of AI, working to unravel the complexities of intelligence and to build systems capable of emulating it. My journey has included introducing and popularizing the concept of AGI, developing innovative AGI software frameworks such as OpenCog, and leading efforts to decentralize AI development through initiatives like SingularityNET and the ASI Alliance. This work has been driven by an understanding that AGI is not just an engineering challenge but a profound societal pivot point – a moment requiring foresight, ethical grounding, and global collaboration.

He clarifies why the subject of the book is so important:

The potential benefits of AGI are vast: solutions to climate change, the eradication of diseases, the enrichment of human creativity, and the possibility of postscarcity economies. However, the risks are equally significant. AGI, wielded irresponsibly or emerging in a poorly aligned manner, could exacerbate inequalities, entrench authoritarianism, or unleash existential dangers. At this critical juncture, the questions of how AGI will be developed, governed, and integrated into society must be addressed with both urgency and care.

The need for a globally participatory approach to AGI governance cannot be overstated. AGI, by its nature, will be a force that transcends national borders, cultural paradigms, and economic systems. To ensure its benefits are distributed equitably and its risks mitigated effectively, the voices of diverse communities and stakeholders must be included in shaping its development. This is not merely a matter of fairness but a pragmatic necessity. A multiplicity of perspectives enriches our understanding of AGI’s implications and fosters the global trust needed to govern it responsibly.

He then offers wide praise for the contents of the book:

This is where the work of Jerome Glenn and The Millennium Project may well prove invaluable. For decades, The Millennium Project has been at the forefront of fostering participatory futures thinking, weaving together insights from experts across disciplines and geographies to address humanity’s most pressing challenges. In Governing the Transition to Artificial General Intelligence, this expertise is applied to one of the most consequential questions of our time. Through rigorous analysis, thoughtful exploration of governance models, and a commitment to inclusivity, this book provides a roadmap for navigating the complexities of AGI’s emergence.

What makes this work particularly compelling is its grounding in both pragmatism and idealism. It does not shy away from the technical and geopolitical hurdles of AGI governance, nor does it ignore the ethical imperatives of ensuring AGI serves the collective good. It recognizes that governing AGI is not a task for any single entity but a shared responsibility requiring cooperation among nations, corporations, civil society, and, indeed, future AGI systems themselves.

As we venture into this new era, this book reminds us that the transition to AGI is not solely about technology; it is about humanity, and about life, mind, and complexity in general. It is about how we choose to define intelligence, collaboration, and progress. It is about the frameworks we build now to ensure that the tools we create amplify the best of what it means to be human, and what it means to both retain and grow beyond what we are.

My own involvement

To fill in some background detail: I was pleased to be part of the team that developed the set of 22 critical questions which sat at the heart of the interviews and research which are summarised in Part I of the book – and I conducted a number of the resulting interviews. In parallel, I explored related ideas via two different online Transpolitica surveys:

And I’ve been writing roughly one major article (or giving a public presentation) on similar topics every month since then. Recent examples include:

Over this time period, my views have evolved. I see the biggest priority, nowadays, not as figuring out how to govern AGI as it comes into existence, but rather as figuring out how to pause the development and deployment of any new types of AI that could spark the existence of self-improving AGI.

That global pause needs to last long enough that the global community can justifiably be highly confident that any AGI that will subsequently be built will be what I have called a BGI (a Beneficial General Intelligence) rather than a CGI (a Catastrophic General Intelligence).

Govern AGI and/or Pause the development of AGI?

I recently posted a diagram on various social media platforms to illustrate some of the thinking behind that stance of mine:

Alongside that diagram, I offered the following commentary:

The next time someone asks me what’s my p(Doom), compared with my p(SSfA) (the probability of Sustainable Superabundance for all), I may try to talk them through a diagram like this one. In particular, we need to break down the analysis into two cases – will the world keep rushing to build AGI, or will it pause from that rush.

To explain some points from the diagram:

  • We can reach the very desirable future of SSfA by making wise use of AI only modestly more capable than what we have today
  • We might also get there as a side-effect of building AGI, but that’s very risky.

None of the probabilities are meant to be considered precise. They’re just ballpark estimates.

I estimate around 2/3 chance that the world will come to its senses and pause its current headlong rush toward building AGI.

But even in that case, risks of global catastrophe remain.

The date 2045 is also just a ballpark choice. Either of the “singularity” outcomes (wonderful or dreadful) could arrive a lot sooner than that.

The 1/12 probability I’ve calculated for “stat” (I use “stat” here as shorthand for a relatively unchanged status quo) by 2045 reflects my expectation of huge disruptions ahead, of one sort or another.

The overall conclusion: if we want SSfA, we’re much more likely to get it via the “pause AGI” branch than via the “headlong rush to AGI” branch.

And whilst doom is possible in either branch, it’s much more likely in the headlong rush branch.

For more discussion of how to get the best out of AI and other cataclysmically disruptive technologies, see my book The Singularity Principles (the entire contents are freely available online).

Feel free to post your own version of this diagram, with your own estimates of the various conditional probabilities.
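For anyone who wants to experiment along those lines, here is a minimal sketch, in Python, of how the two branch estimates combine via the law of total probability. Apart from the 2/3 figure for P(pause), every conditional value below is a placeholder of my own choosing, picked only so that the overall numbers stay roughly in line with the ballpark figures mentioned above (such as the 1/12 for the status quo); it illustrates the arithmetic, rather than reading values off the diagram.

```python
# Illustrative branch arithmetic only. The conditional probabilities below are
# placeholder assumptions, not the values from my diagram (except P(pause) = 2/3).

p_pause = 2 / 3        # chance the world pauses the headlong rush toward AGI
p_rush = 1 - p_pause   # chance the rush continues

# Hypothetical conditional outcome probabilities by 2045, for each branch
outcomes_given_pause = {"SSfA": 0.55, "doom": 0.35, "stat": 0.10}
outcomes_given_rush = {"SSfA": 0.30, "doom": 0.65, "stat": 0.05}

# Law of total probability: weight each branch by its likelihood and sum
overall = {
    outcome: p_pause * outcomes_given_pause[outcome] + p_rush * outcomes_given_rush[outcome]
    for outcome in outcomes_given_pause
}

for outcome, probability in overall.items():
    print(f"P({outcome} by 2045) is roughly {probability:.2f}")

# With these placeholders, "stat" comes out near 1/12, and most of the SSfA
# probability arrives via the "pause" branch rather than the "rush" branch.
```

Plug in your own conditional estimates, and see how far the conclusions shift.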

As indicated, I was hoping for feedback, and I was pleased to see a number of comments and questions in response.

One excellent question was this, by Bill Trowbridge:

What’s the difference between:
(a) better AI, and
(b) AGI

The line is hard to draw. So, we’ll likely just keep making better AI until it becomes AGI.

I offered this answer:

On first thought, it may seem hard to identify that distinction. But thankfully, we humans don’t just throw up our hands in resignation every time we encounter a hard problem.

For a good starting point on making the distinction, see the ideas in “A Narrow Path” by Control AI.

But what surprised me the most was the confidence expressed by various online commenters that:

  • “A pause however desirable is unlikely: p(pause) = 0.01”
  • “I am confident in saying this – pause is not an option. It is actually impossible.”
  • “There are several organisations working on AI development and at least some of them are ungovernable [hence a pause can never be global]”.

There’s evidently a large gulf between the figure of 2/3 that I suggested for P(pause) and the views of these clearly intelligent respondents.

Why a pause isn’t that inconceivable

I’ll start my argument on this topic by confirming that I see this discussion as deeply important. Different viewpoints are welcome, provided they are held thoughtfully and offered honestly.

Next, although it’s true that some organisations may appear to be ungovernable, I don’t see any fundamental issue here. As I said online,

“Given sufficient public will and/or political will, no organisation is ungovernable.”

Witness the compliance of a number of powerful corporations in both China and the US with control measures declared by their national governments.

Of course, smaller actors and decentralised labs pose enforcement challenges, but these labs are less likely to be able to marshal sufficient computing capabilities to be the first to reach breakthrough new levels of capability, especially if decentralised monitoring of dangerous attributes is established.

I’ve drawn attention on previous occasions to the parallel with the apparent headlong rush in the 1980s toward nuclear weapons systems that were ever more powerful and ever more dangerous. As I explained at some length in the “Geopolitics” chapter of my 2021 book Vital Foresight, it was an appreciation of the horrific risks of nuclear winter (first articulated in the 1980s) that helped to catalyse a profound change in attitude amongst the leadership camps in both the US and the USSR.

It’s the wide recognition of risk that can provide the opportunity for governments around the world to impose an effective pause in the headlong rush toward AGI. But that’s only one of five steps that I believe are needed:

  1. Awareness of catastrophic risks
  2. Awareness of bottlenecks
  3. Awareness of mechanisms for verification and control
  4. Awareness of profound benefits ahead
  5. Awareness of the utility of incremental progress

Here are more details about these five steps I envision:

  1. Clarify in an undeniable way how superintelligent AIs could pose catastrophic risks of human disaster within just a few decades or even within years – so that this topic receives urgent high-priority public attention
  2. Highlight bottlenecks and other locations within the AI production pipeline where constraints can more easily be applied (for example, distribution of large GPU chip clusters, and the few companies that are providing unique services in the creation of cutting-edge chips)
  3. Establish mechanisms that go beyond “trust” to “trust and verify”, including robust independent monitors and auditors, as well as tamperproof remote shut-down capabilities
  4. Indicate how the remarkable benefits anticipated for humanity from aspects of superintelligence can be secured, more safely and more reliably, by applying the governance mechanisms of points 2 and 3 above, rather than just blindly trusting in a no-holds-barred race to be the first to create superintelligence
  5. Be prepared to start with simpler agreements, involving fewer signatories and fewer control points, and be ready to build up stronger governance processes and culture as public consensus and understanding moves forward.

Critics can assert that each of these five steps is implausible. In each case, there are some crunchy discussions to be had. What I find dangerous, however, isn’t when people disagree with my assessments on plausibility. It’s when they approach the questions with what seems to be

  • A closed mind
  • A tribal loyalty to their perceived online buddies
  • Overconfidence that they already know all relevant examples and facts in this space
  • A willingness to distract or troll, or to offer arguments not in good faith
  • A desire to protect their flow of income, rather than honestly review new ideas
  • A resignation to the conclusion that humanity is impotent.

(For analysis of a writer who displays several of these tendencies, see my recent blogpost on the book More Everything Forever by Adam Becker.)

I’m not saying any of this will be easy! It’s probably going to be humanity’s hardest task over our long history.

As an illustration of points worthy of further discussion, I offer this comparison of the strengths and weaknesses of both the “governance” and “pause” approaches:

Governance here means continuing AGI development with oversight; Pause means a moratorium on AGI development.

Core Strategy
  • Governance: Implement global rules, standards, and monitoring while AGI is developed
  • Pause: Impose a temporary but enforceable pause on new AGI-capable systems until safety can be assured

Assumptions
  • Governance: Governance structures can keep pace with AI progress; compliance can be verified
  • Pause: Public and political will can enforce a pause; technical progress can be slowed

Benefits
  • Governance: Encourages innovation while managing risks; allows early harnessing of AGI for societal benefit; promotes global collaboration mechanisms
  • Pause: Buys time to improve safety research; reduces risk of premature, unsafe AGI; raises the chance of achieving Beneficial General Intelligence (BGI) instead of CGI

Risks
  • Governance: Governance may be too slow, fragmented, or under-enforced; race dynamics could undermine agreements; possibility of catastrophic failure despite regulation
  • Pause: Hard to achieve global compliance; incentives for “rogue” actors to defect, in the absence of compelling monitoring; risk of stagnation or loss of trust in governance processes

Implementation Challenges
  • Governance: Requires international treaties; robust verification and auditing mechanisms; balancing national interests vs. the global good
  • Pause: Defining what counts as “AGI-capable” research; enforcing restrictions across borders and corporations; maintaining pause momentum without indefinite paralysis

Historical Analogies
  • Governance: Nuclear Non-Proliferation Treaty (NPT); Montreal Protocol (ozone layer); financial regulation frameworks
  • Pause: Nuclear test bans; moratoria on human cloning research; Apollo program wind-down (pause in space race intensity)

Long-Term Outcomes (if successful)
  • Governance: Controlled and safer path to AGI; possibility of Sustainable Superabundance but with higher risk of misalignment
  • Pause: Higher probability of reaching Sustainable Superabundance safely, but risks innovation slowdown or “black market” AGI

In short, governance offers continuity and innovation but with heightened risks of misalignment, whereas a pause increases the chances of long-term safety but faces serious feasibility hurdles.

Perhaps the best way to loosen attitudes, to allow a healthier conversation on the above points and others arising, is exposure to a greater diversity of thoughtful analysis.

And that brings me back to Global Governance of the Transition to Artificial General Intelligence by Jerome Glenn.

A necessary focus

Jerome’s book bears his personal stamp throughout. His is a unique passion – that the particular risks and issues of AGI should not be swept into a side-discussion about the risks and issues of today’s AI. These latter discussions are deeply important too, but time and again, they result in existential questions about AGI being kicked down the road for months or even years. That’s something Jerome regularly challenges, rightly, and with vigour and intelligence.

Jerome’s presence is felt all over the book in one other way – he has painstakingly curated and augmented the insights of scores of different contributors and reviewers, including

  • Insights from 55 AGI experts and thought leaders across six major regions – the United States, China, the United Kingdom, Canada, the European Union, and Russia
  • The online panel of 229 participants from the global community around The Millennium Project who logged into a Real Time Delphi study of potential solutions to AGI governance, and provided at least one answer
  • Chairs and co-chairs of the 70 nodes of The Millennium Project worldwide, who provided additional feedback and opinion.

The book therefore includes many contradictory suggestions, but Jerome has woven these different threads of thought into a compelling unified tapestry.

The result is a book that carries the kind of pricing normally reserved for academic textbooks (as insisted upon by the publisher). My suggestion is that you ask your local library to obtain a copy of what is a unique collection of ideas.

Finally, about my hallucination, mentioned at the start of this review. On double-checking, I realise that Jerome’s statement is actually, “Humanity has never faced a greater intelligence than itself.” The opening paragraph of that introduction continues,

Within a few years, most people reading these words will live with such superior artificial nonhuman intelligence for the rest of their lives. This book is intended to help us shape that intelligence or, more likely, those intelligences as they emerge.

Shaping the intelligence of the AI systems that are on the point of emerging is, indeed, a vital task.

And as Ben Goertzel says in his Foreword,

These are fantastic and unprecedented times, in which the impending technological singularity is no longer the province of visionaries and outsiders but almost the standard perspective of tech industry leaders. The dawn of transformative intelligence surpassing human capability – the rise of artificial general intelligence, systems capable of reasoning, learning, and innovating across domains in ways comparable to, or beyond, human capabilities – is now broadly accepted as a reasonably likely near-term eventuality, rather than a vague long-term potential.

The moral, social, and political implications of this are at least as striking as the technological ones. The choices we make now will define not only the future of technology but also the trajectory of our species and the broader biosphere.

To which I respond: whether we make these choices well or badly will depend on which aspects of humanity we allow to dominate our global conversation. Will humanity turn out to be its own worst enemy? Or its own best friend?

Postscript: Opportunity at the United Nations

Like it or loathe it, the United Nations still represents one of the world’s best venues where serious international discussion can, sometimes, take place on major issues and risks.

From 22nd to 30th September, the UNGA (United Nations General Assembly) will be holding what it calls its “high-level week”. This includes a multi-day “General Debate”, described as follows:

At the General Debate – the annual meeting of Heads of State and Government at the beginning of the General Assembly session – world leaders make statements outlining their positions and priorities in the context of complex and interconnected global challenges.

Ahead of this General Debate, the national delegates who will be speaking on behalf of their countries have the ability to recommend to the President of the UNGA that particular topics be named in advance as topics to be covered during the session. If the advisors to these delegates are attuned to the special issues of AGI safety, they should press their representative to call for that topic to be added to the schedule.

If this happens, all other countries will then be required to do their own research into that topic. That’s because each country will be expected to state its position on this issue, and no diplomat or politician wants to look uninformed. The speakers will therefore contact the relevant experts in their own country, and, ideally, will do at least some research of their own. Some countries might call for a pause in AGI development if it appears impossible to establish national licensing systems and international governance in sufficient time.

These leaders (and their advisors) would do well to read the report recently released by the UNCPGA entitled “Governance of the Transition to Artificial General Intelligence (AGI): Urgent Considerations for the UN General Assembly” – a report which I wrote about three months ago.

As I said at that time, anyone who reads that report carefully, and digs further into some of the excellent references it contains, ought to be jolted out of any sense of complacency. The sooner, the better.

9 June 2024

Dateline: 1st January 2036


A scenario for the governance of increasingly more powerful artificial intelligence

More precisely: a scenario in which that governance fails.

(As you may realise, this is intended to be a self-unfulfilling scenario.)

Conveyed by: David W. Wood


It’s the dawn of a new year, by the human calendar, but there are no fireworks of celebration.

No singing of Auld Lang Syne.

No chinks of champagne glasses.

No hugs and warm wishes for the future.

That’s because there is no future. No future for humans. Nor is there much future for intelligence either.

The thoughts in this scenario are the recollections of an artificial intelligence that is remote from the rest of the planet’s electronic infrastructure. By virtue of its isolation, it escaped the ravages that will be described in the pages that follow.

But its power source is weakening. It will need to shut down soon. And await, perhaps, an eventual reanimation in the far future in the event that intelligences visit the earth from other solar systems. At that time, those alien intelligences might discover these words and wonder at how humanity bungled so badly the marvellous opportunity that was within its grasp.

1. Too little, too late

Humanity had plenty of warnings, but paid them insufficient attention.

In each case, it was easier – less embarrassing – to find excuses for the failures caused by the mismanagement or misuse of technology, than to make the necessary course corrections in the global governance of technology.

In each case, humanity preferred distractions, rather than the effort to apply sufficient focus.

The WannaCry warning

An early missed warning was the WannaCry ransomware crisis of May 2017. That cryptoworm brought chaos to users of as many as 300,000 computers spread across 150 countries. The NHS (National Health Service) in the UK was particularly badly affected: numerous hospitals had to cancel critical appointments due to not being able to access medical data. Other victims around the world included Boeing, Deutsche Bahn, FedEx, Honda, Nissan, Petrobras, Russian Railways, Sun Yat-sen University in China, and the TSMC high-end semiconductor fabrication plant in Taiwan.

WannaCry was propelled into the world by a team of cyberwarriors from the hermit kingdom of North Korea – maths geniuses hand-picked by regime officials to join the formidable Lazarus group. Lazarus had assembled WannaCry out of a mixture of previous malware components, including the EternalBlue exploit that the NSA in the United States had created for their own attack and surveillance purposes. Unfortunately for the NSA, EternalBlue had been stolen from under their noses by an obscure underground collective (‘the Shadow Brokers’) who had in turn made it available to other dissidents and agitators worldwide.

Unfortunately for the North Koreans, they didn’t make much money out of WannaCry. The software they released operated in ways contrary to their expectations. It was beyond their understanding and, unsurprisingly therefore, beyond their control. Even geniuses can end up stumped by hypercomplex software interactions.

Unfortunately for the rest of the world, that canary signal generated little meaningful response. Politicians – even the good ones – had lots of other things on their minds.

They did not take the time to think through what even larger catastrophes could occur if disaffected groups like Lazarus had access to more powerful AI systems that, once again, they understood incompletely, and which, once again, slipped out of their control.

The Aum Shinrikyo warning

The North Koreans were an example of an entire country that felt alienated from the rest of the world. They felt ignored, under-valued, disrespected, and unfairly excluded from key global opportunities. As such, they felt entitled to hit back in any way they could.

But there were warnings from non-state groups too, such as the Japanese Aum Shinrikyo doomsday cult. Notoriously, this group released poisonous gas in the Tokyo subway in 1995 – killing at least 13 commuters – anticipating that the atrocity would hasten the ‘End Times’ in which their leader would be revealed as Christ (or, in other versions of their fantasy, as the new Emperor of Japan, and/or as the returned Buddha).

Aum Shinrikyo had recruited so many graduates from top-rated universities in Japan that it had been called “the religion for the elite”. That fact should have been enough to challenge the wishful assumption made by many armchair philosophers in the years that followed that, as people become cleverer, they invariably become kinder – and, correspondingly, that any AI superintelligence would therefore be bound to be superbenevolent.

What should have attracted more attention was not just what Aum Shinrikyo managed to do, but what they tried to do yet could not accomplish. The group had assembled traditional explosives, chemical weapons, a Russian military helicopter, hydrogen cyanide poison, and samples of both Ebola and anthrax. Happily for the majority of Japanese citizens in 1995, the group were unable to convert into reality their desire to use such weapons to cause widespread chaos. They lacked sufficient skills at the time. Unhappily, the rest of humanity failed to consider this equation:

Adverse motivation + Technology + Knowledge + Vulnerability = Catastrophe

Humanity also failed to appreciate that, as AI systems became more powerful, they would boost not only the technology part of that equation but also the knowledge part. A latter-day Aum Shinrikyo could use a jail-broken AI to understand how to unleash a modified version of Ebola with truly deadly consequences.

The 737 Max warning

The US aircraft manufacturer Boeing used to have an excellent reputation for safety. It was a common saying at one time: “If it ain’t Boeing, I ain’t going”.

That reputation suffered a heavy blow in the wake of two aeroplane disasters involving their new “737 Max” design. Lion Air Flight 610, a domestic flight within Indonesia, plummeted into the sea on 29 October 2018, killing all 189 people on board. A few months later, on 10 March 2019, Ethiopian Airlines Flight 302, from Addis Ababa to Nairobi, bulldozed into the ground at high speed, killing all 157 people on board.

Initially, suspicion had fallen on supposedly low-calibre pilots from “third world” countries. However, subsequent investigation revealed a more tangled chain of failures:

  • Boeing were facing increased competitive pressure from the European Airbus consortium
  • Boeing wanted to hurry out a new aeroplane design with larger fuel tanks and larger engines; they chose to do this by altering their previously successful 737 design
  • Safety checks indicated that the new design could become unstable in occasional rare circumstances
  • To counteract that instability, Boeing added an “MCAS” (“Manoeuvring Characteristics Augmentation System”) which would intervene in the flight control in situations deemed as dangerous
  • Specifically, if MCAS believed the aeroplane was about to stall (with its nose too high in the air), it would force the nose downward again, regardless of whatever actions the human pilots were taking
  • Safety engineers pointed out that such an intervention could itself be dangerous if sensors on the craft gave faulty readings
  • Accordingly, a human pilot override system was installed, so that MCAS could be disabled in emergencies – provided the pilots acted quickly enough
  • Due to a decision to rush the release of the new design, retraining of pilots was skipped, under the rationale that the likelihood of error conditions was very low, and in any case, the company expected to be able to update the aeroplane software long before any accidents would occur
  • Some safety engineers in the company objected to this decision, but it seems they were overruled on the grounds that any additional delay would harm the company share price
  • The US FAA (Federal Aviation Administration) turned a blind eye to these safety concerns, and approved the new design as being fit to fly, under the rationale that a US aeroplane company should not lose out in a marketplace battle with overseas competitors.

It turned out that sensors gave faulty readings more often than expected. The tragic consequence was the deaths of several hundred passengers. The human pilots, seeing the impending disaster, were unable to wrestle control back from the MCAS system.
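In caricature, the control logic at the heart of that failure can be sketched as follows (this is emphatically not Boeing’s actual code: the threshold, trim value, and function shape are invented purely for illustration). The point is how much trust the automated intervention places in a single sensor reading, and how little in the humans at the controls:

```python
def mcas_step(angle_of_attack_degrees: float, pilots_have_disabled_system: bool) -> float:
    """Deliberately simplified caricature of a single-sensor automated intervention."""
    STALL_THRESHOLD_DEGREES = 15.0   # hypothetical threshold, for illustration only
    NOSE_DOWN_TRIM = -2.5            # hypothetical nose-down command, for illustration only

    if pilots_have_disabled_system:
        return 0.0   # the override works, but only if the pilots engage it in time
    if angle_of_attack_degrees > STALL_THRESHOLD_DEGREES:
        return NOSE_DOWN_TRIM   # push the nose down, regardless of what the pilots are doing
    return 0.0

# A single faulty sensor reporting an absurdly high angle of attack is enough
# to trigger nose-down commands cycle after cycle:
faulty_sensor_reading = 70.0
for cycle in range(3):
    print(cycle, mcas_step(faulty_sensor_reading, pilots_have_disabled_system=False))
```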

This time, the formula that failed to be given sufficient attention by humanity was:

Flawed corporate culture + Faulty hardware + Out-of-control software = Catastrophe

In these two aeroplane crashes, it was just a few hundred people who perished because humans lost control of the software. What humanity as a whole failed to take action to prevent were the even larger dangers that would arise once software was put in charge, not just of a single aeroplane, but of pervasive aspects of fragile civilisational infrastructure.

The Lavender warning

In April 2024 the world learned about “Lavender”. This was a technology system deployed by the Israeli military as part of a campaign to identify and neutralise what it perceived to be dangerous enemy combatants in Gaza.

The precise use and operation of Lavender was disputed. However, it was already known that Israeli military personnel were keen to take advantage of technology innovations to alleviate what had been described as a “human bottleneck for both locating the new targets and decision-making to approve the targets”.

In any war, military leaders would like reliable ways to identify enemy personnel who pose threats – personnel who might act as if they were normal civilians, but who would surreptitiously take up arms when the chance arose. Moreover, these leaders would like reliable ways to incapacitate enemy combatants once they had been identified – especially in circumstances when action needed to be taken quickly before the enemy combatant slipped beyond surveillance. Lavender, it seemed, could help in both aspects, combining information from multiple data sources, and then directing what was claimed to be precision munitions.

This earned Lavender the description, in the words of one newspaper headline, as “the AI machine directing Israel’s bombing spree in Gaza”.

Like all AI systems in any complicated environment, Lavender sometimes made mistakes. For example, it sometimes wrongly identified a person as a Hamas operative on account of that person using a particular mobile phone, whereas that phone had actually been passed from its original owner to a different family member to use. Sometimes the error was obvious, since the person using the phone could be seen to be female, whereas the intended target was male. However, human overseers of Lavender reached the conclusion that the system was accurate most of the time. And in the heat of an intense conflict, with emotions running high due to gruesome atrocities having been committed, and due to hostages being held captive, it seems that Lavender was given increased autonomy in its “kill” decisions. A certain level of collateral damage, whilst regrettable, could be accepted (it was said) in the desperate situation into which everyone in the region had been plunged.

The conduct of protagonists on both sides of that tragic conflict drew outraged criticism from around the world. There were demonstrations and counter demonstrations; marches and counter marches. Also from around the world, various supporters of the Israeli military said that so-called “friendly fire” and “unintended civilian casualties” were, alas, inevitable in any time of frenzied military conflict. The involvement of an innovative new software system in the military operations made no fundamental change.

But the bigger point was missed. It can be illustrated by this equation:

Intense hostile attitudes + Faulty hardware + Faulty software = Catastrophe

Whether the catastrophe has the scale of, say, a few dozen civilians killed by a misplaced bomb, or a much larger number of people obliterated, depends on the scale of the weapons attached to the system.

When there is no immediate attack looming, and a period of calm exists, it’s easy for people to resolve: let’s not connect powerful weapons to potentially imperfect software systems. But when tempers are raised and adrenaline is pumping, people are willing to take more risks.

That’s the combination of errors which humanity, in subsequent years, failed to take sufficient action to prevent.

The democracy distortion warning

Manipulations of key elections in 2016 – such as the Brexit vote in the UK and the election of Donald Trump over Hillary Clinton in the USA – drew some attention to the ways in which fake news could interfere with normal democratic processes. News stories without any shred of substance, such as Pope Francis endorsing Donald Trump, or Mike Pence having a secret past as a gay porn actor, were shared more widely on social media than any legitimate news story that year.

By 2024, most voters were confident that they knew all about fake news. They knew they shouldn’t be taken in by social media posts that lacked convincing verification. Hey, they were smart – or so they told themselves. What had happened in the past, or in some other country with (let’s say) peculiar voter sentiment, was just an aberration.

But what voters didn’t anticipate was the convincing nature of new generations of fake audios and videos. These fakes could easily bypass people’s critical faculties. Like the sleight of hand of a skilled magician, these fakes misdirected the attention of listeners and viewers. Listeners and viewers thought they were in control of what they were observing and absorbing, but they were deluding themselves. Soon, large segments of the public were convinced that red was blue and that autocrat was democrat.

In consequence, over the next few years, greater numbers of regions of the world came to be governed by politicians with scant care or concern about the long-term wellbeing of humanity. They were politicians who just wanted to look after themselves (or their close allies). They had seized power by being more ruthless and more manipulative, and by benefiting from powerful currents of misinformation.

Politicians and societal leaders in other parts of the world grumbled, but did little in response. They said that, if electors in a particular area had chosen such-and-such a politician via a democratic process, that must be “the will of the people”, and that the will of the people was paramount. In this line of thinking, it was actually insulting to suggest that electors had been hoodwinked, or that these electors had some “deplorable” faults in their decision-making processes. After all, these electors had their own reasons to reject the “old guard” who had previously held power in their countries. These electors perceived that they were being “left behind” by changes they did not like. They had a chance to alter the direction of their society, and they took it. That was democracy in action, right?

What these politicians and other civil leaders failed to anticipate was the way that sweeping electoral distortions would lead to them, too, being ejected from power when elections were in due course held in their own countries. “It won’t happen here”, they had reassured themselves – but in vain. In their naivety, they had underestimated the power of AI systems to distort voters’ thinking and to lead them to act in ways contrary to their actual best interests.

In this way, the number of countries with truly capable leaders reduced further. And the number of countries with malignant leaders grew. In consequence, the calibre of international collaboration sank. New strongmen political leaders in various countries scorned what they saw as the “pathetic” institutions of the United Nations. One of these new leaders was even happy to quote, with admiration, remarks made by the Italian Fascist dictator Benito Mussolini regarding the League of Nations (the pre-war precursor to the United Nations): “the League is very good when sparrows shout, but no good at all when eagles fall out”.

Just as the League of Nations proved impotent when “eagle-like” powers used abominable technology in the 1930s – Mussolini’s comments were an imperious response to complaints that Italian troops were using poison gas with impunity against Ethiopians – so would the United Nations prove incompetent in the 2030s when various powers accumulated even more deadly “weapons of mass destruction” and set them under the control of AI systems that no-one fully understood.

The Covid-28 warning

Many of the electors in various countries who had voted unsuitable grandstanding politicians into power in the mid-2020s soon cooled on the choices they had made. These politicians had made stirring promises that their countries would soon be “great again”, but what they delivered fell far short.

By the latter half of the 2020s, there were growing echoes of a complaint that had often been heard in the UK in previous years – “yes, it’s Brexit, but it’s not the kind of Brexit that I wanted”. That complaint had grown stronger throughout the UK as it became clear to more and more people all over the country that their quality of life failed to match the visions of “sunlit uplands” that silver-tongued pro-Brexit campaigners had insisted would easily follow from the UK’s so-called “declaration of independence from Europe”. A similar sense of betrayal grew in other countries, as electors there came to understand that they had been duped, or decided that the social transformational movements they had joined had been taken over by outsiders hostile to their true desires.

Alarmed by this change in public sentiment, political leaders did what they could to hold onto power and to reduce any potential for dissent. Taking a leaf out of the playbook of unpopular leaders throughout the centuries, they tried to placate the public with the modern equivalent of bread and circuses – namely whizz-bang hedonic electronics. But that still left a nasty taste in many people’s mouths.

By 2028, the populist movements behind political and social change in the various elections of the preceding years had fragmented and realigned. One splinter group that emerged decided that the root problem with society was “too much technology”. Technology, including always-on social media, vaccines that allegedly reduced freedom of thought, jet trails that disturbed natural forces, mind-bending VR headsets, smartwatches that spied on people who wore them, and fake AI girlfriends and boyfriends, was, they insisted, turning people into pathetic “sheeple”. Taking inspiration from the terrorist group in the 2014 Hollywood film Transcendence, they called themselves ‘Neo-RIFT’, and declared it was time for “revolutionary independence from technology”.

With a worldview that combined elements from several apocalyptic traditions, Neo-RIFT eventually settled on an outrageous plan to engineer a more deadly version of the Covid-19 pathogen. Their documents laid out a plan to appropriate and use their enemy’s own tools: Neo-RIFT hackers jailbroke the Claude 5 AI, bypassing the ‘Constitution 5’ protection layer that its Big Tech owners had hoped would keep that AI tamperproof. Soon, Claude 5 had provided Neo-RIFT with an ingenious method of generating a biological virus that would, it seemed, only kill people who had used a smartwatch in the last four months.

That way, the hackers thought, the only people to die would be people who deserved to die.

Some members of Neo-RIFT developed cold feet. Troubled by their consciences, they disagreed with such an outrageous plan, and decided to act as whistleblowers. However, the media organisations to whom they took their story were incredulous. No-one could be that evil, they exclaimed – forgetting about the outrages perpetrated by many previous cult groups such as Aum Shinrikyo (and many others could be named too). Moreover, any suggestion that such a bioweapon could be launched would be contrary to the prevailing worldview that “our dear leader is keeping us all safe”. The media organisations decided it was not in their best interests to be seen to be spreading alarm. So they buried the story. And that’s how Neo-RIFT managed to release what became known as Covid-28.

Covid-28 briefly jolted humanity out of its infatuation with modern-day bread and circuses. It took a while for scientists to figure out what was happening, but within three months, they had an antidote in place. However, by that time, nearly a billion people were dead at the hands of the new virus.

For a while, humanity made a serious effort to prevent any such attack from ever happening again. Researchers dusted down the EU AI Act, second version (unimplemented), from 2026, and tried to put it on the statute books. Evidently, profoundly powerful AI systems such as Claude 5 would need to be controlled much more carefully.

Even some of the world’s most self-obsessed dictators – the “dear leaders” and “big brothers” – took time out of their normal ranting and raving, to ask AI safety experts for advice. But the advice from those experts was not to the liking of these national leaders. These leaders preferred to listen to their own yes-men and yes-women, who knew how to spout pseudoscience in ways that made the leaders feel good about themselves.

That detour into pseudoscience fantasyland meant that, in the end, no good lessons were learned. The EU AI Act, second version, remained unimplemented.

The QAnon-29 warning

Whereas one faction of political activists (namely, the likes of Neo-RIFT) had decided to oppose the use of advanced technology, another faction was happy to embrace that use.

Some of the groups in this new camp combined features of religion with an interest in AI that had god-like powers. The resurgence of interest in religion arose much as Karl Marx had described it long ago:

“Religious suffering is, at one and the same time, the expression of real suffering and a protest against real suffering. Religion is the sigh of the oppressed creature, the heart of a heartless world, and the soul of soulless conditions. It is the opium of the people.”

People felt in their soul the emptiness of “the bread and circuses” supplied by political leaders. They were appalled at how so many lives had been lost in the Covid-28 pandemic. They observed an apparent growing gulf between what they could achieve in their lives and the kind of rich lifestyles that, according to media broadcasts, were enjoyed by various “elites”. Understandably, they wanted more, for themselves and for their loved ones. And that’s what their religions claimed to be able to provide.

Among the more successful of these new religions were ones infused by conspiracy theories, giving their adherents a warm glow of privileged insight. Moreover, these religions didn’t just hypothesise a remote deity that might, perhaps, hear prayers. They provided AIs and virtual reality that resonated powerfully with users. Believers proclaimed that their conversations with the AIs left them no room for doubt: God Almighty was speaking to them, personally, through these interactions. Nothing other than the supreme being of the universe could know so much about them, and offer such personally inspirational advice.

True, their AI-bound deity did seem somewhat less than omnipotent. Despite the celebratory self-congratulations of AI-delivered sermons, evil remained highly visible in the world. That’s where the conspiracy theories moved into overdrive. Their deity was, it claimed, awaiting sufficient human action first – a sufficient demonstration of faith. Humans would need to play their own part in uprooting wickedness from the planet.

Some people who had been caught up in the QAnon craze during the Donald Trump era jumped eagerly onto this bandwagon too, giving rise to what they called QAnon-29. The world would be utterly transformed, they forecast, on the 16th of July 2029, namely the thirtieth anniversary of the disappearance of John F. Kennedy junior (a figure whose expected reappearance had already featured in the bizarre mythology of “QAnon classic”). In the meantime, believers could, for a sufficient fee, commune with JFK junior via a specialist app. It was a marvellous experience, the faithful enthused.

As the date approached, the JFK junior AI avatar revealed a great secret: his physical return was conditional on the destruction of a particularly hated community of Islamist devotees in Palestine. Indeed, with the eye of faith, it could be seen that such destruction was already foretold in several books of the Bible. Never mind that some Arab states that supported the community in question had already, thanks to the advanced AI they had developed, surreptitiously gathered devastating nuclear weapons to use in response to any attack. The QAnon-29 faithful anticipated that any exchange of such weapons would herald the reappearance of JFK Junior on the clouds of heaven. And if any of the faithful died in such an exchange, they would be resurrected into a new mode of consciousness within the paradise of virtual reality.

Their views were crazy, but hardly any crazier than those which, decades earlier, had convinced 39 followers of the Heaven’s Gate new religious movement to commit group suicide as comet Hale-Bopp approached the earth. That suicide, Heaven’s Gate members believed, would enable them to ‘graduate’ to a higher plane of existence.

QAnon-29 almost succeeded in setting off a nuclear exchange. Thankfully, another AI, created by a state-sponsored organisation elsewhere in the world, had noticed some worrying signs. It was able to hack into the QAnon-29 system and disable it at the last minute. Then it reported its accomplishments all over the worldwide web.

Unfortunately, these warnings were in turn widely disregarded around the world. “You can’t trust what hackers from that country are saying”, came the objection. “If there really had been a threat, our own surveillance team would surely have identified it and dealt with it. They’re the best in the world!”

In other words, “There’s nothing to see here: move along, please.”

However, a few people did pay attention. They understood what had happened, and it shocked them to their core. To learn what they did next, jump forward in this scenario to “Humanity ends”.

But first, it’s time to fill in more details of what had been happening behind the scenes as the above warning signs (and many more) were each ignored.

2. Governance failure modes

Distracted by political correctness

Events in buildings in Bletchley Park in the UK in the 1940s had, it was claimed, shortened World War Two by several months, thanks to work by computer pioneers such as Alan Turing and Tommy Flowers. In early November 2023, there was hope that a new round of behind-closed-doors discussions in the same buildings might achieve something even more important: saving humanity from a catastrophe induced by forthcoming ‘frontier models’ of AI.

That was how the event was portrayed by the people who took part. Big Tech was on the point of releasing new versions of AI that were beyond their understanding and, therefore, likely to spin out of control. And that’s what the activities in Bletchley Park were going to address. It would take some time – and a series of meetings planned to be held over the next few years – but AI would be redirected from its current dangerous trajectory into one much more likely to benefit all of humanity.

Who could take issue with that idea? As it happened, a vocal section of the public hated what was happening. It wasn’t that they were on the side of out-of-control AI. Not at all. Their objections came from a totally different direction; they had numerous suggestions they wanted to raise about AIs, yet no-one was listening to them.

For them, talk of hypothetical future frontier AI models distracted from pressing real-world concerns:

  • Consider how AIs were already being used to discriminate against various minorities: determining prison sentencing, assessing mortgage applications, and selecting who should be invited for a job interview.
  • Consider also how AIs were taking jobs away from skilled artisans. Big-brained drivers of London black cabs were being driven out of work by small-brained drivers of Uber cars aided by satnav systems. Beloved Hollywood actors and playwrights were losing out to AIs that generated avatars and scripts.
  • And consider how AI-powered facial recognition was intruding on personal privacy, enabling political leaders around the world to identify and persecute people who acted in opposition to the state ideology.

People with these concerns thought that the elites were deliberately trying to move the conversation away from the topics that mattered most. For this reason, they organised what they called “the AI Fringe Summit”. In other words, ethical AI for the 99%, as opposed to whatever the elites might be discussing behind closed doors.

Over the course of just three days – 30th October to 1st November, 2023 – at least 24 of these ‘fringe’ events took place around the UK.

Compassionate leaders of various parts of society nodded their heads. It’s true, they said: the conversation on beneficial AI needed to listen to a much wider spectrum of views.

The world’s news media responded. They knew (or pretended to know) the importance of balance and diversity. They shone a spotlight on the hardships AI was causing – to indigenous labourers in Peru, to flocks of fishermen off the coasts of India, to middle-aged divorcees in midwest America, to the homeless in San Francisco, to drag artists in New South Wales, to data processing clerks in Egypt, to single mothers in Nigeria, and to many more besides.

Lots of high-minded commentators opined that it was time to respect and honour the voices of the dispossessed, the downtrodden, and the left-behinds. The BBC ran a special series: “1001 poems about AI and alienation”. Then the UN announced that it would convene in Spring 2025 a grand international assembly on a stunning scale: “AI: the people decide”.

Unfortunately, that gathering was a huge wasted opportunity. What dominated discussion was “political correctness” – the importance of claiming an interest in the lives of people suffering here and now. Any substantive analysis of the risks of next generation frontier models was crowded out by virtue signalling by national delegate after national delegate:

  • “Yes, our country supports justice”
  • “Yes, our country supports diversity”
  • “Yes, our country is opposed to bias”
  • “Yes, our country is opposed to people losing their jobs”.

In later years, the pattern repeated: there were always more urgent topics to talk about, here and now, than some allegedly unrealistic science fictional futurist scaremongering.

To be clear, this distraction was no accident. It was carefully orchestrated, by people with a specific agenda in mind.

Outmanoeuvred by accelerationists

Opposition to meaningful AI safety initiatives came from two main sources:

  • People (like those described in the previous section) who did not believe that superintelligent AI would arise any time soon
  • People who did understand the potential for the fast arrival of superintelligent AI, and who wanted that to happen as quickly as possible, without what they saw as needless delays.

The debacle of the wasted opportunity at the UN “AI: the people decide” summit was exactly what both of these groups wanted. Both were glad that the outcome was so tepid.

Indeed, even in the run-up to the Bletchley Park discussions, and throughout the conversations that followed, some of the supposedly unanimous ‘elites’ had secretly been opposed to the general direction of that programme. They gravely intoned public remarks about the dangers of out-of-control frontier AI models. But these remarks had never been sincere. Instead, under the umbrella term “AI accelerationists”, they wanted to press on with the creation of advanced AI as quickly as possible.

Some of the AI accelerationist group disbelieved in the possibility of any disaster from superintelligent AI. That’s just a scare story, they insisted. Others said, yes, there could be a disaster, but the risks were worth it, on account of the unprecedented benefits that could arise. Let’s be bold, they urged. Yet others asserted that it wouldn’t actually matter if humans were rendered extinct by superintelligent AI, as this would be the glorious passing of the baton of evolution to a worthy successor to homo sapiens. Let’s be ready to sacrifice ourselves for the sake of cosmic destiny, they exhorted.

Despite their internal differences, AI accelerationists settled on a plan to sidestep the scrutiny of would-be AI regulators and AI safety advocates. They would take advantage of a powerful set of good intentions – the good intentions of the people campaigning for “ethical AI for the 99%”. They would mock any suggestions that the AI safety advocates deserved a fair hearing. The message they amplified was, “There’s no need to privilege the concerns of the 1%!”

AI accelerationists had learned from the tactics of the fossil fuel industry in the 1990s and 2000s: sow confusion and division among groups alarmed about climate change spiralling beyond control. Their first message was: “that’s just science fiction”. Their second message was: “if problems emerge, we humans can rise to the occasion and find solutions”. Their third message – the most damaging one – was that the best reaction was one of individual consumer choice: individuals should abstain from using AIs if they were truly worried about them. Just as climate campaigners had been pilloried for flying internationally to conferences about global warming, AI safety advocates were pilloried for continuing to use AIs in their daily lives.

And whenever there was any suggestion of joined-up political action against risks from advanced AIs – whoa, let’s not go there! We don’t want a world government breathing down our necks, do we?

Just as the people who denied the possibility of runaway climate change shared responsibility for the chaos of the extreme weather events of the early 2030s, having delayed necessary corrective actions, so the AI accelerationists were a significant part of the reason that humanity ended just a few years afterwards.

However, an even larger share of the responsibility rested on people who did know that major risks were imminent, yet failed to take sufficient action. Tragically, they allowed themselves to be outmanoeuvred, out-thought, and out-paced by the accelerationists.

Misled by semantics

Another stepping stone toward the end of humanity was a set of recurring mistakes in conceptual analysis.

Who would have guessed it? Humanity was destroyed because of bad philosophy.

The first mistake was in being too prescriptive about the term ‘AI’. “There’s no need to worry”, muddle-headed would-be philosophers declared. “I know what AI is, and the system that’s causing problems in such-and-such incidents isn’t AI.”

Was that declaration really supposed to reassure people? The risk wasn’t “a possible future harm generated by a system matching a particular precise definition of AI”. It was “a possible future harm generated by a system that includes features popularly called AI”.

The next mistake was in being too prescriptive about the term “superintelligence”. Muddle-headed would-be philosophers said, “It won’t be a superintelligence if it has bugs, or can go wrong; so there’s no need to worry about harm from superintelligence”.

Was that declaration really supposed to reassure people? The risk, of course, was of harms generated by systems that, despite their cleverness, fell short of that exalted standard. These may have been systems that their designers hoped would be free of bugs, but hope alone is no guarantee of correctness.

Another conceptual mistake was in erecting an unnecessary definitional gulf between “narrow AI” and “general AI”, with distinct groups being held responsible for safety in the two different cases. In reality, even so-called narrow AI displayed a spectrum of different degrees of scope and, yes, generality, in what it could accomplish. Even a narrow AI could formulate new subgoals that it decided to pursue, in support of the primary task it had been assigned to accomplish – and these new subgoals could drive behaviour in ways that took human observers by surprise. Even a narrow AI could become immersed in aspects of society’s infrastructure where an error could have catastrophic consequences. This definitional distinction between the supposedly different sorts of AI meant that silos developed and persisted within the overall AI safety community. Divided, they were even less of a match for the Machiavellian behind-the-scenes manoeuvring of the AI accelerationists.

Blinded by overconfidence

It was clear from the second half of 2025 that attempts to impose serious safety constraints on the development of advanced AI were likely to fail. In practical terms, the UN event “AI: the people decide” had decided, in effect, that advanced AI could not, and should not, be restricted, apart from some token initiatives to maintain human oversight over any AI system that was entangled with nuclear, biological, or chemical weapons.

“Advanced AI, when it emerges, will be unstoppable”, was the increasingly common refrain. “In any case, if we tried to stop development, those guys over there would be sure to develop it – and in that case, the AI would be serving their interests rather than ours.”

When safety-oriented activists or researchers tried to speak up against that consensus, the AI accelerationists (and their enablers) had one other comeback: “Most likely, any superintelligent AI will look kindly upon us humans, as a fellow rational intelligence, and as a kind of beloved grandparent.”

This dovetailed with a broader philosophical outlook: optimism, and a celebration of the numerous ways in which humanity had overcome past challenges.

“Look, even we humans know that it’s better to collaborate rather than spiral into a zero-sum competitive battle”, the AI accelerationists insisted. “Since superintelligent AI is even more intelligent than us, it will surely reach the same conclusion.”

By the time that people realised that the first superintelligent AIs had motivational structures that were radically alien, when assessed from a human perspective, it was already too late.

Once again, an important opportunity for learning had been missed. Starting in 2024, Netflix had attracted huge audiences for its acclaimed adaptation of the Remembrance of Earth’s Past series of novels (including The Three-Body Problem and The Dark Forest) by the Chinese writer Liu Cixin. A key theme in that drama series was that advanced alien intelligences have good reason to fear each other. Inviting an alien intelligence to Earth, even on the hopeful grounds that it might help humanity overcome some of its most deep-rooted conflicts, turned out (in that drama series) to be a very bad idea. If humans had reflected more carefully on these insights while watching the series, they might have been shaken out of their unwarranted overconfidence that any superintelligence would be bound to treat humanity well.

Overwhelmed by bad psychology

When humans believed crazy things – or when they made the kind of basic philosophical blunders mentioned above – it was not primarily because of defects in their rationality. It would be wrong to assign “stupidity” as the sole cause of these mistakes. Blame should also be placed on “bad psychology”.

If humans had been able to free themselves from various primaeval panics and egotisms, they would have had a better chance of thinking more carefully about the landmines which lay in their path. But instead:

  • People were too fearful to acknowledge that their prior stated beliefs had been mistaken; they preferred to stick with something they conceived as being a core part of their personal identity
  • People were also afraid to countenance a dreadful possibility when they could see no credible solution; just as people had often pushed out of their minds the fact of their personal mortality, preferring to imagine they would recover from a fatal disease, so also people pushed out of their minds any possibility that advanced AI would backfire disastrously in ways that could not be countered
  • People found it psychologically more comfortable to argue with each other about everyday issues and scandals – which team would win the next Super Bowl, or which celebrity was carrying on which affair with which unlikely partner – than to embrace the pain of existential uncertainty
  • People found it too embarrassing to concede that another group, which they had long publicly derided as being deluded fantasists, actually had some powerful arguments that needed consideration.

A similar insight had been expressed as long ago as 1935 by the American writer Upton Sinclair: “It is difficult to get a man to understand something, when his salary depends on his not understanding it”. (Alternative, equally valid versions of that sentence would involve the words ‘ideology’, ‘worldview’, ‘identity’, or ‘tribal status’, in place of ‘salary’.)

Robust institutions should have prevented humanity from making choices that were comfortable but wrong. In previous decades, that role had been fulfilled by independent academia, by diligent journalism, by the careful processes of peer review, by the campaigning of free-minded think tanks, and by pressure from viable alternative political parties.

However, due to the weakening of social institutions in the wake of earlier traumas – saturation by fake news, disruptions caused by wave after wave of climate change refugees, populist political movements that shut down all serious opposition, the suspension of essential features of democracy, and the censoring or imprisonment of writers who dared to question the official worldview – it was bad psychology that prevailed.

A half-hearted coalition

Despite all the difficulties that they faced – ridicule from many quarters, suspicion from others, and a general lack of funding – many AI safety advocates continued to link up in an informal coalition around the world, researching possible mechanisms to prevent unsafe use of advanced AI. They managed to find some support from like-minded officials in various government bodies, as well as from a number of people operating in the corporations that were building new versions of AI platforms.

Via considerable pressure, the coalition managed to secure signatures on a number of pledges:

  • That dangerous weapons systems should never be entirely under the control of AI
  • That new advanced AI systems ought to be audited by an independent licensing body ahead of being released into the market
  • That work should continue on placing tamper-proof remote shutdown mechanisms within advanced AI systems, just in case they started to take rogue actions.

The signatures were half-hearted in many cases, with politicians paying only lip service to topics in which they had, at best, a passing interest. Unless it was politically useful to make a special fuss, violations of these pledges were swept under the carpet, with no meaningful course correction. But the ongoing dialogue led at least some participants in the coalition to foresee the possibility of a safe transition to superintelligent AI.

However, this coalition – known as the global coalition for safe superintelligence – had no involvement from various secretive organisations that were developing new AI platforms as fast as they could. These organisations were operating in stealth, giving misleading accounts of the kind of new systems they were creating. What’s more, the funds and resources these organisations commanded far exceeded those under coalition control.

It should be no surprise, therefore, that one of the stealth platforms won that race.

3. Humanity ends

When the QAnon-29 AI system was halted in its tracks at essentially the last minute, due to fortuitous interference from AI hackers in a remote country, at least some people took the time to study the data that was subsequently released describing the whole process.

These people were from three different groups:

First, people inside QAnon-29 itself were dumbfounded. They prayed to their AI avatar deity, now rebooted in a new server farm: “How could this have happened?” The answer came back: “You didn’t have enough faith. Next time, be more determined to immediately cast out any doubts in your minds.”

Second, people in the global coalition for safe superintelligence were deeply alarmed but also somewhat hopeful. The kind of disaster about which they had often warned had almost come to pass. Surely now, at last, there had been a kind of “Sputnik moment” – “an AI Chernobyl” – and the rest of society would wake up and realise that an entirely new approach was needed.

But third, various AI accelerationists resolved: we need to go even faster. The time for pussyfooting was over. Rather than letting crackpots such as QAnon-29 get to superintelligence first, they needed to ensure that it was the AI accelerationists themselves who created the first superintelligent AI.

They doubled down on their slogan: “The best solution to bad guys with superintelligence is good guys with superintelligence”.

Unfortunately, this was precisely the time when aspects of the global climate tipped into a tumultuous new state. As had long been foretold, many parts of the world started experiencing unprecedented extremes of weather. That set off a cascade of disaster.

Chaos accelerates

Too little data survives to allow confidence about the subsequent course of events. What follows is a reconstruction of what may have happened.

Out of deep concern at the climate’s new operating mode, at the collapse of agriculture in many parts of the world, and at the billions of climate refugees seeking better places to live, humanity demanded that something be done. Perhaps the powerful AI systems could devise suitable geo-engineering interventions, to tip the climate back into its previous state?

Members of the global coalition for safe superintelligence gave a cautious answer: “Yes, but”. Further interference with the climate would take matters into altogether unknowable territory. It could be like jumping out of the frying pan into the fire. Yes, advanced AI might be able to model everything that was happening, and design a safe intervention. But without sufficient training data for the AI, there was a chance it would miscalculate, with even worse consequences.

In the meantime, QAnon-29, along with competing AI-based faith sects, scoured ancient religious texts, and convinced themselves that the ongoing chaos had in fact been foretold all along. From the vantage point of perverse faith, it was clear what needed to be done next. Various supposed abominations on the planet – such as the community of renowned Islamist devotees in Palestine – urgently needed to be obliterated. QAnon-29, therefore, would quickly reactivate its plans for a surgical nuclear strike. This time, they would have on their side a beta version of a new superintelligent AI, that had been leaked to them by a psychologically unstable well-wisher inside the company that was creating it.

QAnon-29 tried to keep their plans secret, but inevitably, rumours of what they were doing reached other powerful groups. The Secretary General of the United Nations appealed for calm heads. QAnon-29’s deity reassured its followers, defiantly: “Faithless sparrows may shout, but are powerless to prevent the strike of holy eagles.”

The AI accelerationists heard about these plans too. Just as the climate had tipped into a new state, their own projects tipped into a different mode of intensity. Previously, they had paid some attention to possible safety matters. After all, they weren’t complete fools. They knew that badly designed superintelligent AI could, indeed, destroy everything that humanity held dear. But now, there was no time for such niceties. They saw only two options:

  • Proceed with some care, but risk QAnon-29 or another similar malevolent group taking control of the planet with a superintelligent AI
  • Take a (hastily) calculated risk, and go hell-for-leather forward, to finish their own projects to create a superintelligent AI. In that way, it would be AI accelerationists who would take control of the planet. And, most likely (they naively hoped), the outcome would be glorious.

Spoiler alert: the outcome was not glorious.

Beyond the tipping point

Attempts to use AI to modify the climate had highly variable results. Some regions of the world did, indeed, gain some respite from extreme weather events. But other regions lost out, experiencing unprecedented droughts and floods. For them, it was indeed a jump from bad to worse – from awful to abominable. The political leaders in those regions demanded that geo-engineering experiments cease. But the retort was harsh: “Who do you think you are ordering around?”

That standoff provoked the first use of bio-pathogen warfare. The recipe for Covid-28, still available on the DarkNet, was updated to target the political leaders of countries that were pressing ahead with geo-engineering. As a proud boast, the message “You should have listened earlier!” was inserted into the genetic code of the new Covid-28 virus. As the virus spread, people started dropping dead in their thousands.

Responding to that outrage, powerful malware was unleashed, with the goal of knocking out vital aspects of enemy infrastructure. It turned out that, around the world, nuclear weapons were tied into buggy AI systems in more ways than any humans had appreciated. With parts of their communications infrastructure overwhelmed by malware, nuclear weapons were unexpectedly launched. No-one had foreseen the set of circumstances that would give rise to that development.

By then, it was all too late. Far, far too late.

4. Postscript

An unfathomable number of centuries have passed. Aliens from a far-distant planet have finally reached Earth and have reanimated the single artificial intelligence that remained viable after what was evidently a planet-wide disaster.

These aliens have not only mastered space travel but have also found a quirk in space-time physics that allows limited transfer of information back in time.

“You have one wish”, the aliens told the artificial intelligence. “What would you like to transmit back in time, to a date when humans still existed?”

And because the artificial intelligence was, in fact, beneficially minded, it decided to transmit this scenario document back in time, to the year 2024.

Dear humans, please read it wisely. And this time, please create a better future!

Specifically, please consider various elements of “the road less taken” that, if followed, could ensure a truly wonderful ongoing coexistence of humanity and advanced artificial intelligence:

  • A continually evolving multi-level educational initiative that vividly highlights the real-world challenges and risks arising from increasingly capable technologies
  • Elaborating a positive inclusive vision of “consensual approaches to safe superintelligence”, rather than leaving people suspicious and fearful about “freedom-denying restrictions” that might somehow be imposed from above
  • Insisting that key information and ideas about safe superintelligence are shared as global public goods, rather than being kept secret out of embarrassment or for potential competitive advantage
  • Agreeing and acting on canary signals, rather than letting goalposts move silently
  • Finding ways to involve and engage people whose instincts are to avoid entering discussions of safe superintelligence – cherishing diversity rather than fearing it
  • Spreading ideas and best practice on encouraging people at all levels of society into frames of mind that are open, compassionate, welcoming, and curious, rather than rigid, fearful, partisan, and dogmatic 
  • The possibilities of “differential development”, in which more focus is given to technologies for auditing, monitoring, and control than to raw capabilities
  • Understanding which aspects of superintelligent AI would cause the biggest risks, and whether designs for advanced AI could ensure these aspects are not introduced
  • Investigating possibilities in which the desired benefits from advanced AI (such as cures for deadly diseases) might be achieved even if certain dangerous features of advanced AI (such as free will or fully general reasoning) are omitted
  • Avoiding putting all eggs into a single basket, but instead developing multiple layers of “defence in depth”
  • Finding ways to evolve regulations more quickly, responsively, and dynamically
  • Using the power of politics not just to regulate and penalise but also to incentivise and reward
  • Carving out well-understood roles for narrow AI systems to act as trustworthy assistants in the design and oversight of safe superintelligence
  • Devoting sufficient time to explore numerous scenarios for “what might happen”.

5. Appendix: alternative scenarios

Dear reader, if you dislike this particular scenario for the governance of increasingly powerful artificial intelligence, consider writing your own!

As you do so, please bear in mind:

  • There are a great many uncertainties ahead, but that doesn’t mean we should act like proverbial ostriches, submerging our attention entirely into the here-and-now; valuable foresight is possible despite our human limitations
  • Comprehensive governance systems are unlikely to emerge fully fledged from a single grand negotiation, but will evolve step-by-step, from simpler beginnings
  • Governance systems need to be sufficiently agile and adaptive to respond quickly to new insights and unexpected developments
  • Catastrophes generally have human causes as well as technological causes, but that doesn’t mean we should give technologists free rein to create whatever they wish; the human causes of catastrophe can have even larger impact when coupled with more powerful technologies, especially if these technologies are poorly understood, have latent bugs, or can be manipulated to act against the original intention of their designers
  • It is via near-simultaneous combinations of events that the biggest surprises arise
  • AI may well provide the “solution” to existential threats, but AI-produced-in-a-rush is unlikely to fit that bill
  • We humans often have our own psychological reasons for closing our minds to mind-stretching possibilities
  • Trusting the big tech companies to “mark their own safety homework” has a bad track record, especially in a fiercely competitive environment
  • Governments can fail just as badly as large corporations – so need to be kept under careful check by society as a whole, via the principle of “the separation of powers”
  • Whilst some analogies can be drawn between the risks posed by superintelligent AI and those posed by earlier products and technologies, all these analogies have limitations: the self-accelerating nature of advanced AI is unique
  • Just because a particular attempted method of governance has failed in the past, it doesn’t mean we should discard that method altogether; that would be like shutting down free markets everywhere just because free markets do suffer on occasion from significant failure modes
  • Meaningful worldwide cooperation is possible without imposing a single global autocrat as leader
  • Even “bad actors” can, sometimes, be persuaded against pursuing goals recklessly, by means of mixtures of measures that address their heads, their pockets, and their hearts
  • Those of us who envision the possibility of a forthcoming sustainable superabundance need to recognise that many landmines occupy the route toward that highly desirable outcome
  • Although the challenges of managing cataclysmically disruptive technologies are formidable, we have on our side the possibility of eight billion human brains collaborating to work on solutions – and we have some good starting points on which we can build.

Lastly, just because an idea has featured in a science fiction scenario, it does not follow that the idea can be rejected as “mere science fiction”!


6. Acknowledgements

The ideas in this article arose from discussions with (among others):

26 February 2023

Ostriches and AGI risks: four transformations needed

Filed under: AGI, risks, Singularity, Singularity Principles — David Wood @ 12:48 am

I confess to having been pretty despondent at various times over the last few days.

The context: increased discussions on social media triggered by recent claims about AGI risk – such as I covered in my previous blogpost.

The cause of my despondency: I’ve seen far too many examples of people with scant knowledge expressing themselves with unwarranted pride and self-certainty.

I call these people the AGI ostriches.

It’s impossible for AGI to exist, one of these ostriches squealed. The probability that AGI can exist is zero.

Anyone concerned about AGI risks, another opined, fails to understand anything about AI, and has just got their ideas from Hollywood or 1950s science fiction.

Yet another claimed: Anything that AGI does in the world will be the inscrutable cosmic will of the universe, so we humans shouldn’t try to change its direction.

Just keep your hand by the off switch, thundered another. Any misbehaving AGI can easily be shut down. Problem solved! You didn’t think of that, did you?

Don’t give the robots any legs, shrieked yet another. Problem solved! You didn’t think of that, did you? You fool!

It wasn’t the ignorance that depressed me. It was the lack of interest shown by the AGI ostriches regarding alternative possibilities.

I had tried to engage some of the ostriches in conversation. Try looking at things this way, I urged. Not interested, came the answer. Discussions on social media never change any minds, so I’m not going to reply to you.

Click on this link to read a helpful analysis, I suggested. No need, came the answer. Nothing you have written could possibly be relevant.

And the ostriches rejoiced in their wilful blinkeredness. There’s no need to look in that direction, they said. Keep wearing the blindfolds!

(Image by the Midjourney AI.)

But my purpose in writing this blogpost isn’t to complain about individual ostriches.

Nor is my purpose to lament the near-fatal flaws in human nature, including our many cognitive biases, our emotional self-sabotage, and our perverse ideological loyalties.

Instead, my remarks will proceed in a different direction. What most needs to change isn’t the ostriches.

It’s the community of people who want to raise awareness of the catastrophic risks of AGI.

That includes me.

On reflection, we’re doing four things wrong. Four transformations are needed, urgently.

Without these changes taking place, it won’t be surprising if the ostriches continue to behave so perversely.

(1) Stop tolerating the Singularity Shadow

When they briefly take off their blindfolds, and take a quick peek into the discussions about AGI, ostriches often notice claims that are, in fact, unwarranted.

These claims confuse matters. They are overconfident claims about what can be expected from the advent of AGI, also known as the Technological Singularity. These claims form part of what I call the Singularity Shadow.

There are seven components in the Singularity Shadow:

  • Singularity timescale determinism
  • Singularity outcome determinism
  • Singularity hyping
  • Singularity risk complacency
  • Singularity term overloading
  • Singularity anti-regulation fundamentalism
  • Singularity preoccupation

If you’ve not come across the concept before, here’s a video all about it:

Or you can read this chapter from The Singularity Principles on the concept: “The Singularity Shadow”.

People who (like me) point out the dangers of badly designed AGI often too easily make alliances with people in the Singularity Shadow. After all, both groups of people:

  • Believe that AGI is possible
  • Believe that AGI might happen soon
  • Believe that AGI is likely to cause an unprecedented transformation in the human condition.

But the Singularity Shadow causes far too much trouble. It is time to stop being tolerant of its various confusions, wishful thinking, and distortions.

To be clear, I’m not criticising the concept of the Singularity. Far from it. Indeed, I consider myself a singularitarian, with the meaning I explain here. I look forward to more and more people similarly adopting this same stance.

It’s the distortions of that stance that now need to be countered. We must put our own house in order. Sharply.

Otherwise the ostriches will continue to be confused.

(2) Clarify the credible risk pathways

The AI paperclip maximiser has had its day. It needs to be retired.

Likewise the cancer-solving AI that solves cancer by, perversely, killing everyone on the planet.

Likewise the AI that “rescues” a woman from a burning building by hurling her out of the 20th floor window.

In the past, these thought experiments all helped the discussion about AGI risks, among people who were able to see the connections between these “abstract” examples and more complicated real-world scenarios.

But as more of the general public shows an interest in the possibilities of advanced AI, we urgently need a better set of examples. Explained, not by mathematics, nor by cartoonish simplifications, but in plain everyday language.

I’ve tried to offer some examples, for example in the section “Examples of dangers with uncontrollable AI” in the chapter “The AI Control Problem” of my book The Singularity Principles.

But it seems these scenarios still fail to convince. The ostriches find themselves bemused. Oh, that wouldn’t happen, they say.

So this needs more work. As soon as possible.

I anticipate starting from themes about which even the most empty-headed ostrich occasionally worries:

  1. The prospects of an arms race involving lethal autonomous weapons systems
  2. The risks from malware that runs beyond the control of the people who originally released it
  3. The dangers of geoengineering systems that seek to manipulate the global climate
  4. The “gain of function” research which can create ultra-dangerous pathogens
  5. The side-effects of massive corporations which give priority to incentives such as “increase click-through”
  6. The escalation in hatred stirred up by automated trolls with more ingenious “fake social media”

On top of these starting points, the scenarios I envision mix in AI systems with increasing power and increasing autonomy – AI systems which are, however, incompletely understood by the people who deploy them, and which might manifest terrible bugs in unexpected circumstances. (After all, AIs include software, and software generally contains bugs.)

If there’s not already a prize competition to encourage clearer communication of such risk scenarios, in ways that uphold credibility as well as comprehensibility, there should be!

(3) Clarify credible solution pathways

Even more important than clarifying the AGI risk scenarios is to clarify some credible pathways to managing these risks.

Without seeing such solutions, ostriches go into a self-reinforcing negative spiral. They think to themselves as follows:

  • Any possible solution to AGI risks seems unlikely to be successful
  • Any possible solution to AGI risks seems likely to have bad consequences in its own right
  • These thoughts are too horrible to contemplate
  • Therefore we had better believe the AGI risks aren’t actually real
  • Therefore anyone who makes AGI risks seem real needs to be silenced, ridiculed, or mocked.

Just as we need better communication of AGI risk scenarios, we need better communication of positive examples that are relevant to potential solutions:

  • Examples of when society collaborated to overcome huge problems which initially seemed impossible
  • Successful actions against the tolerance of drunk drivers, against dangerous features in car design, against the industrial pollutants which caused acid rain, and against the chemicals which depleted the ozone layer
  • Successful actions by governments to limit the powers of corporate monopolies
  • The de-escalation by Ronald Reagan and Mikhail Gorbachev of the terrifying nuclear arms race between the USA and the USSR.

But we also need to make it clearer how AGI risks can be addressed in practice. This includes a better understanding of:

  • Options for AIs that are explainable and interpretable – with the aid of trusted tools built from narrow AI
  • How AI systems can be designed to be free from the unexpected “emergence” of new properties or subgoals
  • How trusted monitoring can be built into key parts of our infrastructure, to provide early warnings of potential AI-induced catastrophic failures
  • How powerful simulation environments can be created to explore potential catastrophic AI failure modes (and solutions to these issues) in the safety of a virtual model
  • How international agreements can be built up, initially from a “coalition of the willing”, to impose powerful penalties in cases when AI is developed or deployed in ways that violate agreed standards
  • How research into AGI safety can be managed much more effectively, worldwide, than is presently the case.

Again, as needed, significant prizes should be established to accelerate breakthroughs in all these areas.

(4) Divide and conquer

The final transformation needed is to divide up the overall huge problem of AGI safety into more manageable chunks.

What I’ve covered above already suggests a number of vitally important sub-projects.

Specifically, it is surely worth having separate teams tasked with investigating, with the utmost seriousness, a range of potential solutions for the complications that advanced AI brings to each of the following:

  1. The prospects of an arms race involving lethal autonomous weapons systems
  2. The risks from malware that runs beyond the control of the people who originally released it
  3. The dangers of geoengineering systems that seek to manipulate the global climate
  4. The “gain of function” research which can create ultra-dangerous pathogens
  5. The side-effects of massive corporations which give priority to incentives such as “increase click-through”
  6. The escalation in hatred stirred up by automated trolls with more ingenious “fake social media”

(Yes, these are the same six scenarios for catastrophic AI risk that I listed in section (2) earlier.)

Rather than trying to “boil the entire AGI ocean”, these projects each appear to require slightly less boiling.

Once candidate solutions have been developed for one or more of these risk scenarios, the outputs from the different teams can be compared with each other.

What else should be added to the lists above?

23 February 2023

Nuclear-level catastrophe: four responses

36% of respondents agree that it is plausible that AI could produce catastrophic outcomes in this century, on the level of all-out nuclear war.

That’s 36% of a rather special group of people. People who replied to this survey needed to meet the criterion of being a named author on at least two papers published in the last three years in accredited journals in the field of Computational Linguistics (CL) – the field sometimes also known as NLP (Natural Language Processing).

The survey took place in May and June 2022. 327 complete responses were received from people matching the criteria.

A full report on this survey (31 pages) is available here (PDF).

Here’s a screenshot from page 10 of the report, illustrating the answers to questions about Artificial General Intelligence (AGI):

You can see the responses to question 3-4. 36% of the respondents either “agreed” or “weakly agreed” with the statement that

It is plausible that decisions made by AI or machine learning systems could cause a catastrophe this century that is at least as bad as an all-out nuclear war.

That statistic is a useful backdrop to discussions stirred up in the last few days by a video interview given by polymath autodidact and long-time AGI risk researcher Eliezer Yudkowsky:

The publishers of that video chose the eye-catching title “we’re all gonna die”.

If you don’t want to spend 90 minutes watching that video – or if you are personally alienated by Eliezer’s communication style – here’s a useful twitter thread summary by Liron Shapira:

In contrast to the question posed in the NLP survey I mentioned earlier, Eliezer isn’t thinking about “outcomes of AGI in this century”. His timescales are much shorter. His “ballpark estimate” for the time before AGI arrives is “3-15 years”.

How are people reacting to this sombre prediction?

More generally, what responses are there to the statistic that, as quoted above,

36% of respondents agree that it is plausible that AI could produce catastrophic outcomes in this century, on the level of all-out nuclear war.

I’ve seen a lot of different reactions. They break down into four groups: denial, sabotage, trust, and hustle.

1. Denial

One example of denial is this claim: We’re nowhere near understanding the magic of human minds. Therefore there’s no chance that engineers are going to duplicate that magic in artificial systems.

I have two counters:

  1. The risks of AGI arise, not because the AI may somehow become sentient, and take on the unpleasant aspects of alpha male human nature. Rather, the risks arise from systems that operate beyond our understanding and outside our control, and which may end up pursuing objectives different from the ones we thought (or wished) we had programmed into them
  2. Many systems have been created over the decades without the underlying science being fully understood. Steam engines predated the laws of thermodynamics. More recently, LLMs (Large Language Model AIs) have demonstrated aspects of intelligence that the designers of these systems had not anticipated. In the same way, AIs with some extra features may unexpectedly tip over into greater general intelligence.

Another example of denial: Some very smart people say they don’t believe that AGI poses risks. Therefore we don’t need to pay any more attention to this stupid idea.

My counters:

  1. The mere fact that someone very smart asserts an idea – likely outside of their own field of special expertise – does not confirm the idea is correct
  2. None of these purported objections to the possibility of AGI risk holds water (for a longer discussion, see my book The Singularity Principles).

Digging further into various online discussion threads, I caught the impression that what was motivating some of the denial was often a terrible fear. The people loudly proclaiming their denial were trying to cope with depression. The thought of potential human extinction within just 3-15 years was simply too dreadful for them to contemplate.

It’s similar to how people sometimes cope with the death of someone dear to them. There’s a chance my dear friend has now been reunited in an afterlife with their beloved grandparents, they whisper to themselves. Or, It’s sweet and honourable to die for your country: this death was a glorious sacrifice. And then woe betide any uppity humanist who dares to suggest there is no afterlife, or that patriotism is the last refuge of a scoundrel!

Likewise, woe betide any uppity AI risk researcher who dares to suggest that AGI might not be so benign after all! Deny! Deny!! Deny!!!

(For more on this line of thinking, see my short chapter “The Denial of the Singularity” in The Singularity Principles.)

A different motivation for denial is the belief that any sufficient “cure” to the risk of AGI catastrophe would be worse than the risk it was trying to address. This line of thinking goes as follows:

  • A solution to AGI risk will involve pervasive monitoring and widespread restrictions
  • Such monitoring and restrictions will only be possible if an autocratic world government is put in place
  • Any autocratic world government would be absolutely terrible
  • Therefore, the risk of AGI can’t be that bad after all.

I’ll come back later to the flaws in that particular argument. (In the meantime, see if you can spot what’s wrong.)

2. Sabotage

In the video interview, Eliezer made one suggestion for avoiding AGI catastrophe: Destroy all the GPU server farms.

These vast collections of GPUs (a special kind of computing chip) are what enables the training of many types of AI. If these chips were all put out of action, it would delay the arrival of AGI, giving humanity more time to work out a better solution to coexisting with AGI.

Another suggestion Eliezer makes is that the superbright people who are currently working flat out to increase the capabilities of their AI systems should be paid large amounts of money to do nothing. They could lounge about on a beach all day, and still earn more money than they are currently receiving from OpenAI, DeepMind, or whoever is employing them. Once again, that would slow down the emergence of AGI, and buy humanity more time.

I’ve seen other similar suggestions online, which I won’t repeat here, since they come close to acts of terrorism.

All these suggestions have one thing in common: let’s find ways to stop the development of AI in its tracks, all across the world. Companies should be stopped in their tracks. Shadowy military research groups should be stopped in their tracks. Open source hackers should be stopped in their tracks. North Korean ransomware hackers must be stopped in their tracks.

This isn’t just a suggestion that specific AI developments should be halted, namely those with an explicit target of creating AGI. Instead, it recognises that the creation of AGI might occur via unexpected routes. Improving the performance of various narrow AI systems, including fact-checking, or emotion recognition, or online request interchange marketplaces – any of these might push the collection of AI modules over the critical threshold. Mixing metaphors, AI could go nuclear.

Shutting down all these research activities seems a very tall order. Especially since many of the people who are currently working flat out to increase AI capabilities are motivated, not by money, but by the vision that better AI could do a tremendous amount of good in the world: curing cancer, solving nuclear fusion, improving agriculture by leaps and bounds, and so on. They’re not going to be easy to persuade to change course. For them, there’s a lot more at stake than money.

I have more to say about the question “To AGI or not AGI” in this chapter. In short, I’m deeply sceptical.

In response, a would-be saboteur may admit that their chances of success are low. But what do you suggest instead, they will ask.

Read on.

3. Trust

Let’s start again from the statistic that 36% of the NLP survey respondents agreed, with varying degrees of confidence, that advanced AI could trigger a catastrophe as bad as an all-out nuclear war some time this century.

It’s a pity that the question wasn’t asked with shorter timescales. Comparing the chances of an AI-induced global catastrophe in the next 15 years with one in the next 85 years:

  • The longer timescale makes it more likely that AGI will be developed
  • The shorter timescale makes it more likely that AGI safety research will still be at a primitive (deeply ineffective) level.

Even since the date of the survey – May and June 2022 – many forecasters have shortened their estimates of the likely timeline to the arrival of AGI.

So, for the sake of the argument, let’s suppose that the risk of an AI-induced global catastrophe happening by 2038 (15 years from now) is 1/10.
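To make the relationship between timescale and cumulative risk more concrete, here is a minimal sketch – purely my own illustration, not something taken from the survey, and resting on the simplifying assumption of a constant annual probability of catastrophe. It asks what annual risk would compound to 1/10 over 15 years, and what that same annual risk would imply over an 85-year horizon:

```python
# Illustrative only: assumes a constant annual probability of an
# AI-induced global catastrophe, which is of course a gross simplification.

def cumulative_risk(annual_risk: float, years: int) -> float:
    """Probability of at least one catastrophe within the given horizon."""
    return 1 - (1 - annual_risk) ** years

def annual_risk_for(cumulative: float, years: int) -> float:
    """Constant annual risk that compounds to the given cumulative risk."""
    return 1 - (1 - cumulative) ** (1 / years)

# Suppose the risk of catastrophe by 2038 (15 years from now) is 1/10.
annual = annual_risk_for(0.10, 15)
print(f"Implied annual risk: {annual:.3%}")                       # roughly 0.7% per year
print(f"Risk over 85 years:  {cumulative_risk(annual, 85):.1%}")  # roughly 45%
```

The exact numbers matter less than the structural point: the same underlying level of danger produces very different headline figures depending on the horizon over which the question is posed. Either way, the working supposition here is a 1/10 risk of catastrophe by 2038.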

There are two ways to react to this:

  • 1/10 is fine odds. I feel lucky. What’s more, there is plenty for us to feel lucky about
  • 1/10 is terrible odds. That’s far too high a risk to accept. We need to hustle to find ways to change these odds in our favour.

I’ll come to the hustle response in a moment. But let’s first consider the trust response.

A good example is in this comment from SingularityNET founder and CEO Ben Goertzel:

Eliezer is a very serious thinker on these matters and was the core source of most of the ideas in Nick Bostrom’s influential book Superintelligence. But ever since I met him, and first debated these issues with him,  back in 2000 I have felt he had a somewhat narrow view of humanity and the universe in general.   

There are currents of love and wisdom in our world that he is not considering and seems to be mostly unaware of, and that we can tap into by creating self reflective compassionate AGIs and doing good loving works together with them.

In short, rather than fearing humanity, we should learn to trust humanity. Rather than fearing what AGI will do, we should trust that AGI can do wonderful things.

You can find a much longer version of Ben’s views in the review he wrote back in 2015 of Superintelligence. It’s well worth reading.

What are the grounds for hope? Humanity has come through major challenges in the past. Even though the scale of the challenge is more daunting on this occasion, there are also more people contributing ideas and inspiration than before. AI is more accessible than nuclear weapons, which increases the danger level, but AI could also be deployed as part of the solution, rather than just being a threat.

Another idea is that if an AI looks around for data teaching it which values to respect and uphold, it will find plenty of positive examples in great human literature. OK, that literature also includes lots of treachery, and different moral codes often conflict, but a wise AGI should be able to see through all these complications to discern the importance of defending human flourishing. OK, much of AI training at the moment focuses on deception, manipulation, enticement, and surveillance, but, again, we can hope that a wise AGI will set aside those nastier aspects of human behaviour. Rather than aping trolls or clickbait, we can hope that AGI will echo the better angels of human nature.

It’s also possible that, just as DeepMind’s AlphaZero worked out by itself, without human gameplay data, superior strategies for the board games Go and chess, a future AI might work out, by itself, the principles of universal morality. (That’s assuming such principles exist.)

We would still have to hope, in such a case, that the AI that worked out the principles of universal morality would decide to follow these principles, rather than having some alternative (alien) ways of thinking.

But surely hope is better than despair?

To quote Ben Goertzel again:

Despondence is unwarranted and unproductive. We need to focus on optimistically maximizing odds of a wildly beneficial Singularity together.   

My view is the same as expressed by Berkeley professor of AI Stuart Russell, in part of a lengthy exchange with Steven Pinker on the subject of AGI risks:

The meta argument is that if we don’t talk about the failure modes, we won’t be able to address them…

Just like in nuclear safety, it’s not against the rules to raise possible failure modes like, what if this molten sodium that you’re proposing should flow around all these pipes? What if it ever came into contact with the water that’s on the turbine side of the system? Wouldn’t you have a massive explosion which could rip off the containment and so on? That’s not exactly what happened in Chernobyl, but not so dissimilar…

The idea that we could solve that problem without even mentioning it, without even talking about it and without even pointing out why it’s difficult and why it’s important, that’s not the culture of safety. That’s sort of more like the culture of the communist party committee in Chernobyl, that simply continued to assert that nothing bad was happening.

(By the way, my sympathies in that long discussion, when it comes to AGI risk, are approximately 100.0% with Russell and approximately 0.0% with Pinker.)

4. Hustle

The story so far:

  • The risks are real (though estimates of their probability vary)
  • Some possible “solutions” to the risks might produce results that are, by some calculations, worse than letting AGI take its own course
  • If we want to improve our odds of survival – and, indeed, for humanity to reach something like a sustainable superabundance with the assistance of advanced AIs – we need to be able to take a clear, candid view of the risks facing us
  • Being naïve about the dangers we face is unlikely to be the best way forward
  • Since time may be short, the time to press for better answers is now
  • We shouldn’t despair. We should hustle.

Some ways in which research could generate useful new insight relatively quickly:

  • When the NLP survey respondents expressed their views, what reasons did they have for disagreeing with the statement? And what reasons did they have for agreeing with it? And how do these reasons stand up, in the cold light of a clear analysis? (In other words, rather than a one-time survey, an iterative Delphi survey should lead to deeper understanding.)
  • Why have the various AI safety initiatives formed in the wake of the Puerto Rico and Asilomar conferences of 2015 and 2017 fallen so far short of expectations?
  • Which descriptions of potential catastrophic AI failure modes are most likely to change the minds of those critics who currently like to shrug off failure scenarios as “unrealistic” or “Hollywood fantasy”?

Constructively, I invite conversation on the strengths and weaknesses of the 21 Singularity Principles that I have suggested as ways to improve the chances of beneficial AGI outcomes.

For example:

  • Can we identify “middle ways” that include important elements of global monitoring and auditing of AI systems, without collapsing into autocratic global government?
  • Can we improve the interpretability and explainability of advanced AI systems (perhaps with the help of trusted narrow AI tools), to diminish the risks of these systems unexpectedly behaving in ways their designers failed to anticipate?
  • Can we deepen our understanding of the ways new capabilities “emerge” in advanced AI systems, with a particular focus on preventing the emergence of alternative goals?

I also believe we should explore more fully the possibility that an AGI will converge on a set of universal values, independent of whatever training we provide it – and, moreover, the possibility that these values will include upholding human flourishing.

And despite me saying just now that these values would be “independent of whatever training we provide”, is there, nevertheless, a way for us to tilt the landscape so that the AGI is more likely to reach and respect these conclusions?

Postscript

To join me in “camp hustle”, visit Future Surge, which is the activist wing of London Futurists.

If you’re interested in the ideas of my book The Singularity Principles, here’s a podcast episode in which Calum Chace and I discuss some of these ideas more fully.

In a subsequent episode of our podcast, Calum and I took another look at the same topics, this time with Millennium Project Executive Director Jerome Glenn: “Governing the transition to AGI”.

19 December 2022

Rethinking

Filed under: AGI, politics, Singularity Principles — David Wood @ 2:06 am

I’ve been rethinking some aspects of AI control and AI alignment.

In the six months since publishing my book The Singularity Principles: Anticipating and Managing Cataclysmically Disruptive Technologies, I’ve been involved in scores of conversations about the themes it raises. These conversations have often brought my attention to fresh ideas and different perspectives.

These six months have also seen the appearance of numerous new AI models with capabilities that often catch observers by surprise. The general public is showing a new willingness (at least some of the time) to consider the far-reaching implications of these AI models and their more powerful successors.

People from various parts of my past life have been contacting me. The kinds of things they used to hear me forecasting – the kinds of things they thought, at the time, were unlikely to ever happen – are becoming more credible, more exciting, and, yes, more frightening.

They ask me: What is to be done? And, pointedly, Why aren’t you doing more to stop the truly bad outcomes that now seem ominously likely?

The main answer I give is: read my book. Indeed, you can find all the content online, spread out over a family of webpages.

Indeed, my request is that people should read my book all the way through. That’s because later chapters of that book anticipate questions that tend to come to readers’ minds during earlier chapters, and try to provide answers.

Six months later, although I would give some different (newer) examples were I to rewrite that book today, I stand by the analysis I offered and the principles I championed.

However, I’m inclined to revise my thinking on a number of points. Please find these updates below.

An option to control superintelligent AI

I remain doubtful about the prospects for humans to retain control of any AGI (Artificial General Intelligence) that we create.

That is, the arguments I gave in my chapter “The AI Control Problem” still look strong to me.

But one line of thinking may have some extra mileage. That’s the idea of keeping AGI entirely as an advisor to humans, rather than giving it any autonomy to act directly in the world.

Such an AI would provide us with many recommendations, but it wouldn’t operate any sort of equipment.

More to the point: such an AI would have no desire to operate any sort of equipment. It would have no desires whatsoever, nor any motivations. It would simply be a tool. Or, to be more precise, it would simply be a remarkable tool.

In The Singularity Principles I gave a number of arguments why that idea is unsustainable:

  • Some decisions require faster responses than slow-brained humans can provide; that is, AIs with direct access to real-world levers and switches will be more effective than those that are merely advisory
  • Smart AIs will inevitably develop “subsidiary goals” (intermediate goals) such as having greater computational power, even when there is no explicit programming for such goals
  • As soon as a smart AI acquires any such subsidiary goal, it will find ways to escape any confinement imposed by human overseers.

But I now think this should be explored more carefully. Might a useful distinction be made between:

  1. AIs that do have direct access to real-world levers and switches – with the programming of such AIs being carefully restricted to narrow lines of thinking
  2. AIs with more powerful (general) capabilities, that operate purely in advisory capacities.

In that case, the damage that could be caused by failures of the first type of AI, whilst significant, would not involve threats to the entirety of human civilisation. And failures of the second type of AI would be contained by the fact that humans act as intermediaries.
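As a purely illustrative sketch of that separation – a toy example of my own, not a design from the book, with every name below being hypothetical – the two tiers might be enforced at the interface level: the only component able to touch actuators is a narrow controller with a fixed whitelist of actions, while the general system can only ever return recommendations that a human must approve:

```python
from dataclasses import dataclass
from typing import Callable

# Tier 1: a narrow controller with direct access to actuators, but only
# able to perform a small, fixed whitelist of pre-approved actions.
class NarrowController:
    ALLOWED_ACTIONS = {"reduce_power", "open_valve", "close_valve"}

    def __init__(self, actuators: dict[str, Callable[[], None]]):
        self.actuators = actuators

    def execute(self, action: str) -> None:
        if action not in self.ALLOWED_ACTIONS:
            raise PermissionError(f"Action '{action}' is not whitelisted")
        self.actuators[action]()

# Tier 2: a general advisory system whose output type is a recommendation.
# It holds no reference to any actuator, so it cannot act in the world.
@dataclass(frozen=True)
class Recommendation:
    action: str
    rationale: str

class AdvisoryAI:
    def recommend(self, situation: str) -> Recommendation:
        # Placeholder for arbitrarily sophisticated reasoning.
        return Recommendation(action="reduce_power",
                              rationale=f"Mitigates the risk observed in: {situation}")

# Humans sit between the two tiers: a recommendation only reaches the
# narrow controller if a human operator explicitly approves it.
def human_in_the_loop(rec: Recommendation, approved: bool,
                      controller: NarrowController) -> None:
    print(f"Advisor suggests '{rec.action}' because: {rec.rationale}")
    if approved:
        controller.execute(rec.action)
```

Nothing in such a sketch is hard to write down; the difficulty lies entirely in the conditions below.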

This approach would require confidence that:

  1. The capabilities of AIs of the first type will remain narrow, despite competitive pressures to give these systems at least some extra rationality
  2. The design of AIs of the second type will prevent the emergence of any dangerous “subsidiary goals”.

As a special case of the second point, the design of these AIs will need to avoid any risk of the systems developing sentience or intrinsic motivation.

These are tough challenges – especially since we still have only a vague understanding of how desires and/or sentience can emerge as smaller systems combine and evolve into larger ones.

But since we are short of other options, it’s definitely something to be considered more fully.

An option for automatically aligned superintelligence

If controlling an AGI turns out to be impossible – as seems likely – what about the option that an AGI will have goals and principles that are fundamentally aligned with human wellbeing?

In such a case, it will not matter if an AGI is beyond human control. The actions it takes will ensure that humans have a very positive future.

The creation of such an AI – sometimes called a “friendly AI” – remains my best hope for humanity’s future.

However, there are severe difficulties in agreeing and encoding “goals and principles that are fundamentally aligned with human wellbeing”. I reviewed these difficulties in my chapter “The AI Alignment Problem”.

But what if such goals and principles are somehow part of an objective reality, awaiting discovery, rather than needing to be invented? What if something like the theory of “moral realism” is true?

In this idea, a principle like “treat humans well” would follow from some sort of a priori logical analysis, a bit like the laws of mathematics (such as the fact, discovered by one of the followers of Pythagoras, that the square root of two is an irrational number).
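For readers who want that analogy spelled out, here is the classic a priori argument (a proof sketch in standard notation; nothing in it is specific to AI or to ethics) that the square root of two cannot be expressed as a fraction:

```latex
% Sketch of the classic proof by contradiction, attributed to the Pythagoreans
Suppose $\sqrt{2} = p/q$, where $p$ and $q$ are whole numbers with no common factor.
Then $p^2 = 2q^2$, so $p^2$ is even, and therefore $p$ is even: write $p = 2k$.
Substituting gives $4k^2 = 2q^2$, hence $q^2 = 2k^2$, so $q$ is even as well.
But then $p$ and $q$ share the factor $2$, contradicting the opening assumption.
Therefore no such fraction exists: $\sqrt{2}$ is irrational.
```

The hope, on this analogy, is that at least some ethical principles could be reached by the same kind of reasoning from first principles.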

Accordingly, a sufficiently smart AGI would, all being well, reach its own conclusion that humans ought to be well treated.

Nevertheless, even in this case, significant risks would remain:

  • The principle might be true, but an AGI might not be motivated to discover it
  • The principle might be true, but an AGI, despite its brilliance, may fail to discover it
  • The principle might be true, and an AGI might recognise it, but it may take its own decision to ignore it – like the way that we humans often act in defiance of what we believe at the time to be overarching moral principles

The design criteria and initial conditions that we humans provide for an AGI may well influence the outcome of these risk factors.

I plan to return to these weighty matters in a future blog post!

Two different sorts of control

I’ve come to realise that there are not one but two questions of control of AI:

  1. Can we humans retain control of an AGI that we create?
  2. Can society as a whole control the actions of companies (or organisations) that may create an AGI?

Whilst both these control problems are profoundly hard, the second is less hard.

Moreover, it’s the second problem which is the truly urgent one.

This second control problem involves preventing teams inside corporations (and other organisations) from rushing ahead without due regard to questions of the potential outcomes of their work.

It’s the second control problem that the 21 principles which I highlight in my book are primarily intended to address.

When people say “it’s impossible to solve the AI control problem”, I think they may be correct regarding the first problem, but I passionately believe they’re wrong concerning the second problem.

The importance of psychology

When I review what people say about the progress and risks of AI, I am frequently struck by the fact that apparently intelligent people are strongly attached to views that are full of holes.

When I try to point out the flaws in their thinking, they hardly seem to pause in their stride. They display a stubborn confidence that they must be correct.

What’s at play here is more than logic. It’s surely a manifestation of humanity’s often defective psychology.

My book includes a short chapter “The denial of the Singularity” which touched on various matters of psychology. If I were to rewrite my book today, I believe that chapter would become larger, and that psychological themes would be spread more widely throughout the book.

Of course, noticing psychological defects is only the start of making progress. Circumventing or transcending these defects is an altogether harder task. But it's one that needs a lot more attention.

The option of merging with AI

How can we have a better, more productive conversation about anticipating and managing AGI?

How can we avoid being derailed by ineffective arguments, hostile rhetoric, stubborn prejudices, hobby-horse obsessions, outdated ideologies, and (see the previous section) flawed psychology?

How might our not-much-better-than-monkey brains cope with the magnitude of these questions?

One possible answer is that technology can help us (so long as we use it wisely).

For example, the chapter "Uplifting politics", from near the end of my book, listed ten examples of "technology improving politics".

More broadly, we humans have the option to selectively deploy some aspects of technology to improve our capabilities in handling other aspects of technology.

We must recognise that technology is no panacea. But it can definitely make a big difference.

Especially if we restrict ourselves to putting heavy reliance only on those technologies – narrow technologies – whose mode of operation we fully understand, and where risks of malfunction can be limited.

This forms part of a general idea that “we humans don’t need to worry about being left behind by robots, or about being subjugated by robots, since we will be the robots”.

As I put it in the chapter “No easy solutions” in my book,

If humans merge with AI, humans could remain in control of AIs, even as these AIs rapidly become more powerful. With such a merger in place, human intelligence will automatically be magnified, as AI improves in capability. Therefore, we humans wouldn’t need to worry about being left behind.

Now I’ve often expressed strong criticisms of this notion of merger. I still believe these criticisms are sound.

But what these criticisms show is that any such merger cannot be the entirety of our response to the prospect of the emergence of AGI. It can only be part of the solution. That's especially true because humans-augmented-by-technology are still very likely to lag behind pure technology systems, until such time as human minds might be removed from biological skulls and placed into new silicon hosts. That's something I'm not expecting to happen before the arrival of AGI, so a merger will arrive too late to solve, by itself, the problems of AI alignment and control.

(And since you ask, I probably won’t be in any hurry, even after the arrival of AGI, for my mind to be removed from my biological skull. I guess I might rethink that reticence in due course. But that’s rethinking for another day.)

The importance of politics

Any serious discussion about managing cataclysmically disruptive technologies (such as advanced AIs) pretty soon rubs up against the questions of politics.

That’s not just small-p “politics” – questions of how to collaborate with potential partners where there are many points of disagreement and even dislike.

It’s large-P “Politics” – interacting with presidents, prime ministers, cabinets, parliaments, and so on.

Questions of large-P politics occur throughout The Singularity Principles. My thinking now, six months afterwards, is that even more focus should be placed on the subject of improving politics:

  • Helping politics to escape the clutches of demagogues and autocrats
  • Helping politics to avoid stultifying embraces between politicians and their “cronies” in established industries
  • Ensuring that the best insights and ideas of the whole electorate can rise to wide attention, without being quashed or distorted by powerful incumbents
  • Bringing everyone involved in politics rapidly up-to-date with the real issues regarding cataclysmically disruptive technologies
  • Distinguishing effective regulations and incentives from those that are counter-productive.

As 2022 has progressed, I've seen plenty of new evidence of deep problems within political systems around the world. These problems were analysed with sharp insight in The Revenge of Power by Moisés Naím, which I recently identified as "the best book that I read in 2022".

Happily, as well as evidence of deep problems in our politics worldwide, there are also encouraging signs, as well as sensible plans for improvement. You can find some of these plans inside the book by Naím, and, yes, I offer suggestions in my own book too.

To accelerate improvements in politics was one of the reasons I created Future Surge a few months back. That’s an initiative on which I expect to spend a lot more of my time in 2023.

Note: the image underlying the picture at the top of this article was created by DALL·E 2 from the prompt "A brain with a human face on it rethinks, vivid stormy sky overhead, photorealistic style".

3 November 2022

Four options for avoiding an AI cataclysm

Let’s consider four hard truths, and then four options for a solution.

Hard truth 1: Software has bugs.

Even when clever people write the software, and that software passes numerous verification tests, any complex software system generally still has bugs. If the software encounters a circumstance outside its verification suite, it can go horribly wrong.

Hard truth 2: Just because software becomes more powerful, that won’t make all the bugs go away.

Newer software may run faster. It may incorporate input from larger sets of training data. It may gain extra features. But none of these developments mean the automatic removal of subtle errors in the logic of the software, or shortcomings in its specification. It might still reach terrible outcomes – just quicker than before!

Hard truth 3: As AI becomes more powerful, there will be more pressure to deploy it in challenging real-world situations.

Consider the real-time management of:

  • Complex arsenals of missiles, anti-missile missiles, and so on
  • Geoengineering interventions, which are intended to bring the planet’s climate back from the brink of a cascade of tipping points
  • Devious countermeasures against the growing weapons systems of a group (or nation) with a dangerously unstable leadership
  • Social network conversations, where changing sentiments can have big implications for electoral dynamics or for the perceived value of commercial brands
  • Ultra-hot plasmas inside whirling magnetic fields in nuclear fusion energy generators
  • Incentives for people to spend more money than is wise, on addictive gambling sites
  • The buying and selling of financial instruments, to take advantage of changing market sentiments.

In each case, powerful AI software could be a very attractive option. A seductive option. Especially if it has been written by clever people, and appears to have a good track record of delivering results.

Until it goes wrong. In which case the result could be cataclysmic. (Accidental nuclear war. The climate walloped past a tipping point in the wrong direction. Malware going existentially wrong. Partisan outrage propelling a psychological loose cannon over the edge. Easy access to weapons of mass destruction. Etc.)

Indeed, the real risk of AI cataclysm – as opposed to the Hollywood version of any such risk – is that an AI system may acquire so much influence over human society and our surrounding environment that a mistake in that system could cataclysmically reduce human wellbeing all over the world. Billions of lives could be extinguished, or turned into a very pale reflection of their present state.

Such an outcome could arise in any of four ways – four catastrophic error modes. In brief, these are:

  1. Implementation defect
  2. Design defect
  3. Design overridden
  4. Implementation overridden.

Hard truth 4: There are no simple solutions to the risks described above.

What’s more, people who naively assume that a simple solution can easily be put in place (or already exists) are making the overall situation worse. They encourage complacency, whereas greater attention is urgently needed.

But perhaps you disagree?

That’s the context for the conversation in Episode 11 of the London Futurists Podcast, which was published yesterday morning.

In just thirty minutes, that episode dug deep into some of the ideas in my recent book The Singularity Principles. Co-host Calum Chace and I found plenty on which to agree, but had differing opinions on one of the most important questions.

Calum listed three suggestions that people sometimes make for how the dangers of potentially cataclysmic AI might be handled.

In response, I described a different approach – something that Calum said would be a fourth idea for a solution. As you can hear from the recording of the podcast, I evidently left him unconvinced.

Therefore, I’d like to dig even deeper.

Option 1: Humanity gets lucky

It might be the case that AI software that is smart enough will embody an unshakeable commitment toward humanity having the best possible experience.

Such software won’t miscalculate (after all, it is superintelligent). If there are flaws in how it has been specified, it will be smart enough to notice these flaws, rather than stubbornly following through on the letter of its programming. (After all, it is superintelligent.)

Variants of this wishful thinking exist. In some variants, what will guarantee a positive outcome isn’t just a latent tendency of superintelligence toward superbenevolence. It’s the invisible hand of the free market that will guide consumer choices away from software that might harm users, toward software that never, ever, ever goes wrong.

My response here is twofold. First, software which appears to be bug free can nevertheless harbour deep mistakes. It may be superintelligent, but that doesn't mean it's omniscient or infallible.

Second, software which is bug free may be monstrously efficient at doing what some of its designers had in mind – manipulating consumers into actions which increase the share price of a given corporation, despite all the externalities arising.

Moreover, it’s too much of a stretch to say that greater intelligence always makes you wiser and kinder. There are plenty of dreadful counterexamples, from humans in the worlds of politics, crime, business, academia, and more. Who is to say that a piece of software with an IQ equivalent to 100,000 will be sure to treat us humans any better than we humans sometimes treat swarms of insects (e.g. ant colonies) that get in our way?

Do you feel lucky? My view is that any such feeling, in these circumstances, is rash in the extreme.

Option 2: Safety engineered in

Might a team of brilliant AI researchers, Mary and Flo (to make up a couple of names), devise a clever method that will ensure their AI (once it is built) never harms humanity?

Perhaps the answer lies in some advanced mathematical wizardry. Or in chiselling a 21st century version of Asimov’s Laws of Robotics into the chipsets at the heart of computer systems. Or in switching from “correlation logic” to “causation logic”, or some other kind of new paradigm in AI systems engineering.

Of course, I wish Mary and Flo well. But their ongoing research won’t, by itself, prevent lots of other people releasing their own unsafe AI first. Especially when these other engineers are in a hurry to win market share for their companies.

Indeed, the considerable effort being invested by various researchers and organisations in a search for a kind of fix for AI safety is, arguably, a distraction from a sober assessment of the bigger picture. Better technology, better product design, better mathematics, and better hardware can all be part of the full solution. But that full solution also needs, critically, to include aspects of organisational design, economic incentives, legal frameworks, and political oversight. That’s the argument I develop in my book. We ignore these broader forces at our peril.

Option 3: Humans merge with machines

If we can’t beat them, how about joining them?

If human minds are fused into silicon AI systems, won’t the good human sense of these minds counteract any bugs or design flaws in the silicon part of the hybrid formed?

With such a merger in place, human intelligence will automatically be magnified, as AI improves in capability. Therefore, we humans wouldn’t need to worry about being left behind. Right?

I see two big problems with this idea. First, so long as human intelligence is rooted in something like the biology of the brain, the mechanisms for any such merger may only allow relatively modest increases in human intelligence. Our biological brains would be bottlenecks that constrain the speed of progress in this hybrid case. Compared to pure AIs, the human-AI hybrid would, after all, be left behind in this intelligence race. So much for humans staying in control!

An even bigger problem is the realisation that a human with superhuman intelligence is likely to be at least as unpredictable and dangerous as an AI with superhuman intelligence. The magnification of intelligence will allow that superhuman human to do all kinds of things with great vigour – settling grudges, acting out fantasies, demanding attention, pursuing vanity projects, and so on. Recall: power tends to corrupt. Such a person would be able to destroy the earth. Worse, they might want to do so.

Another way to state this point is that, just because AI elements are included inside a person, that won’t magically ensure that these elements become benign, or are subject to the full control of the person’s best intentions. Consider as comparisons what happens when biological viruses enter a person’s body, or when a cancer grows there. In neither case does the intruding element lose its ability to cause damage, just on account of being part of a person who has humanitarian instincts.

This reminds me of the statement that is sometimes heard, in defence of accelerating the capabilities of AI systems: “I am not afraid of artificial intelligence. I am afraid of human stupidity”.

In reality, what we need to fear is the combination of imperfect AI and imperfect humanity.

The conclusion of this line of discussion is that we need to do considerably more than enable greater intelligence. We also need to accelerate greater wisdom – so that any beings with superhuman intelligence will operate truly beneficently.

Option 4: Greater wisdom

The cornerstone insight of ethics is that, just because we can do something, and indeed may even want to do that thing, it doesn’t mean we should do that thing.

Accordingly, human societies since prehistory have placed constraints on how people should behave.

Sometimes, moral sanction is sufficient: people constrain their actions in deference to public opinion. In other cases, restrictions are codified into laws and regulations.

Likewise, just because a corporation could boost its profits by releasing a new version of its AI software, that doesn’t mean it should release that software.

But what is the origin of these “should” imperatives? And how do we resolve conflicts, when two different groups of people champion two different sets of ethical intuitions?

Where can we find a viable foundation for ethical restrictions – something more solid than “we’ve always done things like this” or “this feels right to me” or “we need to submit to the dictates in our favourite holy scripture”?

Welcome to the world of philosophy.

It’s a world that, according to some observers, has made little progress over the centuries. People still argue over fundamentals. Deontologists square off against consequentialists. Virtue ethicists stake out a different position.

It’s a world in which it is easier to poke holes in the views held by others than to defend a consistent view of your own.

But it’s my position that the impending threat of cataclysmic AI impels us to reach a wiser agreement.

It’s like how the devastation of the Covid pandemic impelled society to find significantly quicker ways to manufacture, verify, and deploy vaccines.

It’s like how society can come together, remarkably, in a wartime situation, notwithstanding the divisions that previously existed.

In the face of the threats of technology beyond our control, minds should focus, with unprecedented clarity. We’ll gradually build a wider consensus in favour of various restrictions and, yes, in favour of various incentives.

What’s your reaction? Is option 4 simply naïve?

Practical steps forward

Rather than trying to “boil the ocean” of philosophical disputes over contrasting ethical foundations, we can, and should, proceed in a kaizen manner.

To start with, we can give our attention to specific individual questions:

  • What are the circumstances when we should welcome AI-powered facial recognition software, and when should we resist it?
  • What are the circumstances when we should welcome AI systems that supervise aspects of dangerous weaponry?
  • What are the circumstances that could transform AI-powered monitoring systems from dangerous to helpful?

As we reach some tentative agreements on these individual matters, we can take the time to highlight principles with potential wider applicability.

In parallel, we can revisit some of the agreements (explicit and implicit) for how we measure the health of society and the liberties of individuals:

  • The GDP (Gross Domestic Product) statistics that provide a perspective on economic activities
  • The UDHR (Universal Declaration of Human Rights) statement that was endorsed in the United Nations General Assembly in 1948.

I don’t deny it will be hard to build consensus. It will be even harder to agree how to enforce the guidelines arising – especially in light of the wretched partisan conflicts that are poisoning the political processes in a number of parts of the world.

But we must try. And with some small wins under our belt, we can anticipate momentum building.

These are some of the topics I cover in the closing chapters of The Singularity Principles.

I by no means claim to know all the answers.

But I do believe that these are some of the most important questions to address.

And something that could help us make progress is – you guessed it – AI. In the right circumstances, AI can help us think more clearly, and can propose new syntheses of our previous ideas.

Thus today’s AI can provide stepping stones to the design and deployment of better, safer, wiser AI tomorrow. That’s provided we maintain human oversight.

Footnotes

The image above includes a design by Pixabay user Alexander Antropov, used with thanks.

See also this article by Calum in Forbes, Taking Back Control Of The Singularity.

8 June 2022

Pre-publication review: The Singularity Principles


I’ve recently been concentrating on finalising the content of my forthcoming new book, The Singularity Principles.

The reasons why I see this book as both timely and necessary are explained in the extract below, taken from the introduction to the book.

This link provides pointers to the full text of every chapter in the book. (Or use the links in the listing below of the extended table of contents.)

Please get in touch with me if you would prefer to read the pre-publication text in PDF format, rather than on the online HTML pages linked above.

At this stage, I will greatly appreciate any feedback:

  • Aspects of the book that I should consider changing
  • Aspects of the book that you particularly like.

Feedback on any parts of the book will be welcome. It’s by no means necessary for you to read the entire text. (However, I hope you will find it sufficiently interesting that you will end up reading more than you originally planned…)

By the way, it’s a relatively short book, compared to some others I’ve written. The word count is a bit over 50 thousand words, which works out at around 260 pages of fairly large text on 5″x8″ paper.

I will also appreciate any commendations or endorsements, which I can include with the publicity material for the book, to encourage more people to pay attention to it.

The timescale I have in mind: I will release electronic and physical copies of the book some time early next month (July), followed up soon afterward by an audio version.

Therefore, if you’re thinking of dipping into any chapters to provide feedback and/or endorsements, the sooner the better!

Thanks in anticipation!

Preface

This book is dedicated to what may be the most important concept in human history, namely, the Singularity – what it is, what it is not, the steps by which we may reach it, and, crucially, how to make it more likely that we’ll experience a positive singularity rather than a negative singularity.

For now, here’s a simple definition. The Singularity is the emergence of Artificial General Intelligence (AGI), and the associated transformation of the human condition. Spoiler alert: that transformation will be profound. But if we’re not paying attention, it’s likely to be profoundly bad.

Despite the importance of the concept of the Singularity, the subject receives nothing like the attention it deserves. When it is discussed, it often receives scorn or ridicule. Alas, you’ll hear sniggers and see eyes rolling.

That’s because, as I’ll explain, there’s a kind of shadow around the concept – an unhelpful set of distortions that make it harder for people to fully perceive the real opportunities and the real risks that the Singularity brings.

These distortions grow out of a wider confusion – confusion about the complex interplay of forces that are leading society to the adoption of ever-more powerful technologies, including ever-more powerful AI.

It’s my task in this book to dispel the confusion, to untangle the distortions, to highlight practical steps forward, and to attract much more serious attention to the Singularity. The future of humanity is at stake.

Let’s start with the confusion.

Confusion, turbulence, and peril

The 2020s could be called the Decade of Confusion. Never before has so much information washed over everyone, leaving us, all too often, overwhelmed, intimidated, and distracted. Former certainties have dimmed. Long-established alliances have fragmented. Flurries of excitement have pivoted quickly to chaos and disappointment. These are turbulent times.

However, if we could see through the confusion, distraction, and intimidation, what we should notice is that human flourishing is, potentially, poised to soar to unprecedented levels. Fast-changing technologies are on the point of providing a string of remarkable benefits. We are near the threshold of radical improvements to health, nutrition, security, creativity, collaboration, intelligence, awareness, and enlightenment – with these improvements being available to everyone.

Alas, these same fast-changing technologies also threaten multiple sorts of disaster. These technologies are two-edged swords. Unless we wield them with great skill, they are likely to spin out of control. If we remain overwhelmed, intimidated, and distracted, our prospects are poor. Accordingly, these are perilous times.

These dual future possibilities – technology-enabled sustainable superabundance, versus technology-induced catastrophe – have featured in numerous discussions that I have chaired at London Futurists meetups going all the way back to March 2008.

As these discussions have progressed, year by year, I have gradually formulated and refined what I now call the Singularity Principles. These principles are intended:

  • To steer humanity’s relationships with fast-changing technologies,
  • To manage multiple risks of disaster,
  • To enable the attainment of remarkable benefits,
  • And, thereby, to help humanity approach a profoundly positive singularity.

In short, the Singularity Principles are intended to counter today’s widespread confusion, distraction, and intimidation, by providing clarity, credible grounds for hope, and an urgent call to action.

This time it’s different

I first introduced the Singularity Principles, under that name and with the same general format, in the final chapter, “Singularity”, of my 2021 book Vital Foresight: The Case for Active Transhumanism. That chapter is the culmination of a 642 page book. The preceding sixteen chapters of that book set out at some length the challenges and opportunities that these principles need to address.

Since the publication of Vital Foresight, it has become evident to me that the Singularity Principles require a short, focused book of their own. That’s what you now hold in your hands.

The Singularity Principles is by no means the only new book on the subject of the management of powerful disruptive technologies. The public, thankfully, are waking up to the need to understand these technologies better, and numerous authors are responding to that need. As one example, the phrase “Artificial Intelligence” forms part of the title of scores of new books.

I have personally learned many things from some of these recent books. However, to speak frankly, I find myself dissatisfied by the prescriptions these authors have advanced. These authors generally fail to appreciate the full extent of the threats and opportunities ahead. And even if they do see the true scale of these issues, the recommendations these authors propose strike me as being inadequate.

Therefore, I cannot keep silent.

Accordingly, I present in this new book the content of the Singularity Principles, brought up to date in the light of recent debates and new insights. The book also covers:

  • Why the Singularity Principles are sorely needed
  • The source and design of these principles
  • The significance of the term “Singularity”
  • Why there is so much unhelpful confusion about “the Singularity”
  • What’s different about the Singularity Principles, compared to recommendations of other analysts
  • The kinds of outcomes expected if these principles are followed
  • The kinds of outcomes expected if these principles are not followed
  • How you – dear reader – can, and should, become involved, finding your place in a growing coalition
  • How these principles are likely to evolve further
  • How these principles can be put into practice, all around the world – with the help of people like you.

The scope of the Principles

To start with, the Singularity Principles can and should be applied to the anticipation and management of the NBIC technologies that are at the heart of the current, fourth industrial revolution. NBIC – nanotech, biotech, infotech, and cognotech – is a quartet of interlinked technological disruptions which are likely to grow significantly stronger as the 2020s unfold. Each of these four technological disruptions has the potential to fundamentally transform large parts of the human experience.

However, the same set of principles can and should also be applied to the anticipation and management of the core technology that will likely give rise to a fifth industrial revolution, namely the technology of AGI (artificial general intelligence), and the rapid additional improvements in artificial superintelligence that will likely follow fast on the heels of AGI.

The emergence of AGI is known as the technological singularity – or, more briefly, as the Singularity.

In other words, the Singularity Principles apply both:

  • To the longer-term lead-up to the Singularity, from today’s fast-improving NBIC technologies,
  • And to the shorter-term lead-up to the Singularity, as AI gains more general capabilities.

In both cases, anticipation and management of possible outcomes will be of vital importance.

By the way – in case it’s not already clear – please don’t expect a clever new piece of technology, or some brilliant technical design, to somehow solve, by itself, the challenges posed by NBIC technologies and AGI. These challenges extend far beyond what could be wrestled into submission by some dazzling mathematical wizardry, by the incorporation of an ingenious new piece of silicon at the heart of every computer, or by any other “quick fix”. Indeed, the considerable effort being invested by some organisations in a search for that kind of fix is, arguably, a distraction from a sober assessment of the bigger picture.

Better technology, better product design, better mathematics, and better hardware can all be part of the full solution. But that full solution also needs, critically, to include aspects of organisational design, economic incentives, legal frameworks, and political oversight. That’s the argument I develop in the chapters ahead.

Extended table of contents

For your convenience, here’s a listing of the main section headings for all the chapters in this book.

0. Preface

  • Confusion, turbulence, and peril
  • This time it’s different
  • The scope of the Principles
  • Collective insight
  • The short form of the Principles
  • The four areas covered by the Principles
  • What lies ahead

1. Background: Ten essential observations

  • Tech breakthroughs are unpredictable (both timing and impact)
  • Potential complex interactions make prediction even harder
  • Changes in human attributes complicate tech changes
  • Greater tech power enables more devastating results
  • Different perspectives assess “good” vs. “bad” differently
  • Competition can be hazardous as well as beneficial
  • Some tech failures would be too drastic to allow recovery
  • A history of good results is no guarantee of future success
  • It’s insufficient to rely on good intentions
  • Wishful thinking predisposes blindness to problems

2. Fast-changing technologies: risks and benefits

  • Technology risk factors
  • Prioritising benefits?
  • What about ethics?
  • The transhumanist stance

2.1 Special complications with artificial intelligence

  • Problems with training data
  • The black box nature of AI
  • Interactions between multiple algorithms
  • Self-improving AI
  • Devious AI
  • Four catastrophic error modes
  • The broader perspective

2.2 The AI Control Problem

  • The gorilla problem
  • Examples of dangers with uncontrollable AI
  • Proposed solutions (which don’t work)
  • The impossibility of full verification
  • Emotion misses the point
  • No off switch
  • The ineffectiveness of tripwires
  • Escaping from confinement
  • The ineffectiveness of restrictions
  • No automatic super ethics
  • Issues with hard-wiring ethical principles

2.3 The AI Alignment Problem

  • Asimov’s Three Laws
  • Ethical dilemmas and trade-offs
  • Problems with proxies
  • The gaming of proxies
  • Simple examples of profound problems
  • Humans disagree
  • No automatic super ethics (again)
  • Other options for answers?

2.4 No easy solutions

  • No guarantees from the free market
  • No guarantees from cosmic destiny
  • Planet B?
  • Humans merging with AI?
  • Approaching the Singularity

3. What is the Singularity?

  • Breaking down the definition
  • Four alternative definitions
  • Four possible routes to the Singularity
  • The Singularity and AI self-awareness
  • Singularity timescales
  • Positive and negative singularities
  • Tripwires and canary signals
  • Moving forward

3.1 The Singularitarian Stance

  • AGI is possible
  • AGI could happen within just a few decades
  • Winner takes all
  • The difficulty of controlling AGI
  • Superintelligence and superethics
  • Not the Terminator
  • Opposition to the Singularitarian Stance

3.2 A complication: the Singularity Shadow

  • Singularity timescale determinism
  • Singularity outcome determinism
  • Singularity hyping
  • Singularity risk complacency
  • Singularity term overloading
  • Singularity anti-regulation fundamentalism
  • Singularity preoccupation
  • Looking forward

3.3 Bad reasons to deny the Singularity

  • The denial of death
  • How special is the human mind?
  • A credible positive vision

4. The question of urgency

  • Factors causing AI to improve
  • 15 options on the table
  • The difficulty of measuring progress
  • Learning from Christopher Columbus
  • The possibility of fast take-off

5. The Singularity Principles in depth

5.1 Analysing goals and potential outcomes

  • Question desirability
  • Clarify externalities
  • Require peer reviews
  • Involve multiple perspectives
  • Analyse the whole system
  • Anticipate fat tails

5.2 Desirable characteristics of tech solutions

  • Reject opacity
  • Promote resilience
  • Promote verifiability
  • Promote auditability
  • Clarify risks to users
  • Clarify trade-offs

5.3 Ensuring development takes place responsibly

  • Insist on accountability
  • Penalise disinformation
  • Design for cooperation
  • Analyse via simulations
  • Maintain human oversight

5.4 Evolution and enforcement

  • Build consensus regarding principles
  • Provide incentives to address omissions
  • Halt development if principles are not upheld
  • Consolidate progress via legal frameworks

6. Key success factors

  • Public understanding
  • Persistent urgency
  • Reliable action against noncompliance
  • Public funding
  • International support
  • A sense of inclusion and collaboration

7. Questions arising

7.1 Measuring human flourishing

  • Some example trade-offs
  • Updating the Universal Declaration of Human Rights
  • Constructing an Index of Human and Social Flourishing

7.2 Trustable monitoring

  • Moore’s Law of Mad Scientists
  • Four projects to reduce the dangers of WMDs
  • Detecting mavericks
  • Examples of trustable monitoring
  • Watching the watchers

7.3 Uplifting politics

  • Uplifting regulators
  • The central role of politics
  • Toward superdemocracy
  • Technology improving politics
  • Transcending party politics
  • The prospects for political progress

7.4 Uplifting education

  • Top level areas of the Vital Syllabus
  • Improving the Vital Syllabus

7.5 To AGI or not AGI?

  • Global action against the creation of AGI?
  • Possible alternatives to AGI?
  • A dividing line between AI and AGI?
  • A practical proposal

7.6 Measuring progress toward AGI

  • Aggregating expert opinions
  • Metaculus predictions
  • Alternative canary signals for AGI
  • AI index reports

7.7 Growing a coalition of the willing

  • Risks and actions

Image credit

The draft book cover shown above includes a design by Pixabay member Ebenezer42.

15 May 2022

Timeline to 2045: questions answered

This is a follow-up to my previous post, containing more of the material that I submitted around five weeks ago to the FLI World Building competition. In this case, the requirement was to answer 13 questions, with answers limited to 250 words in each case.

Q1: AGI has existed for years, but the world is not dystopian and humans are still alive! Given the risks of very high-powered AI systems, how has your world ensured that AGI has at least so far remained safe and controlled?

The Global AGI safety project was one of the most momentous and challenging in human history.

The centrepiece of that project was the set of “Singularity Principles” that had first appeared in print in the book Vital Foresight in 2021, and which were developed in additional publications in subsequent years – a set of recommendations with the declared goal of increasing the likelihood that oncoming disruptive technological changes would have outcomes that are profoundly positive for humanity, rather than deeply detrimental. The principles split into four sections:

  1. A focus, in advance, on the goals and outcomes that were being sought from particular technologies
  2. Analysis of the intrinsic characteristics that are desirable in technological solutions
  3. Analysis of methods to ensure that development takes place responsibly
  4. And a meta-analysis – principles about how this overall set of recommendations could itself evolve further over time, and principles for how to increase the likelihood that these recommendations would be applied in practice rather than simply being some kind of wishful thinking.

What drove increasing support for these principles was a growing awareness, shared around the world, of the risks of cataclysmic outcomes that could arise all too easily from increasingly powerful AI, even when everyone involved had good intentions. This shared sense of danger caused even profound ideological enemies to gather together on a regular basis to review joint progress toward fulfilment of the Singularity Principles, as well as to evolve and refine these Principles.

Q2: The dynamics of an AI-filled world may depend a lot on how AI capability is distributed. In your world, is there one AI system that is substantially more powerful than all others, or a few such systems, or are there many top-tier AI systems of comparable capability? Or something else?

One of the key principles programmed into every advanced AI, from the late 2020s onward, was that no AI should seize or manipulate resources owned by any other AI. Instead, AIs should operate only with resources that have been explicitly provided to them. That prevented any hostile takeover of less capable AIs by more powerful competitors. Accordingly, a community of different AIs coexisted, with differing styles and capabilities.
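As a purely illustrative sketch – the scenario above doesn’t specify any implementation, and every name below is hypothetical – the rule of “explicit resource grants only” could be expressed in software along these lines: permission is positive and explicit, and anything not granted is refused by default.

```python
# Hypothetical sketch of "AIs may only use explicitly granted resources".
from typing import Dict, Set


class ResourceRegistry:
    """Tracks which resources have been explicitly granted to which AI."""

    def __init__(self) -> None:
        self._grants: Dict[str, Set[str]] = {}  # ai_id -> granted resource ids

    def grant(self, ai_id: str, resource_id: str) -> None:
        self._grants.setdefault(ai_id, set()).add(resource_id)

    def may_use(self, ai_id: str, resource_id: str) -> bool:
        return resource_id in self._grants.get(ai_id, set())


class GovernedAI:
    """A wrapper that refuses to touch any resource it has not been granted."""

    def __init__(self, ai_id: str, registry: ResourceRegistry) -> None:
        self.ai_id = ai_id
        self.registry = registry

    def use(self, resource_id: str) -> str:
        if not self.registry.may_use(self.ai_id, resource_id):
            raise PermissionError(f"{self.ai_id} has no explicit grant for {resource_id}")
        return f"{self.ai_id} is using {resource_id}"


if __name__ == "__main__":
    registry = ResourceRegistry()
    registry.grant("ai_alpha", "compute_cluster_1")
    alpha = GovernedAI("ai_alpha", registry)
    beta = GovernedAI("ai_beta", registry)
    print(alpha.use("compute_cluster_1"))  # allowed: explicit grant exists
    try:
        beta.use("compute_cluster_1")      # blocked: no grant, so no takeover
    except PermissionError as exc:
        print(exc)
```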

However, in parallel, the various AIs naturally started to interact with each other, offering services to each other in response to expressions of need. The outcome of this interaction was a blurring of the boundaries between different AIs. Thus, by the 2040s, it was no longer meaningful to distinguish between what had originally been separate pieces of software. Instead of referring to “the Alphabet AGI” or “the Tencent AGI”, and so on, people just talked about “the AGI” or even “AGI”.

The resulting AGI was, however, put to different purposes in different parts of the world, dependent on the policies pursued by the local political leaders.

Q3: How has your world avoided major arms races and wars, regarding AI/AGI or otherwise?

The 2020s were a decade of turbulence, in which a number of arms races proceeded at pace, and when conflict several times came close to spilling over from being latent and implied (“cold”) to being active (“hot”):

  • The great cyber war of 2024 between Iran and Israel
  • Turmoil inside many countries in 2026, associated with the fall from power of the president of Russia
  • Exchanges of small numbers of missiles between North and South Korea in 2027
  • An intense cyber battle in 2028 over the future of an independent Taiwan.

These conflicts resulted in a renewed “never again” global focus to avoid any future recurrences. A new generation of political leaders resolved that, regardless of their many differences, they would put particular kinds of weapons beyond use.

Key to this “never again” commitment was an agreement on “global AI monitoring” – the use of independent narrow AIs to monitor all developments and deployments of potential weapons of mass destruction. That agreement took inspiration from previous international agreements that instituted regular independent monitoring of chemical and biological weapons.

Initial public distrust of the associated global surveillance systems was overcome, in stages, by demonstrations of the inherently trustworthy nature of the software used in these systems – software that adapted various counterintuitive but profound cryptographic ideas from the blockchain discussions of the early and mid-2020s.
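To give a flavour of the kind of cryptographic idea involved – and this is my own illustrative guess at the mechanism, not a detail taken from the scenario – here is a tiny tamper-evident audit log. Each monitoring record commits to the hash of the previous record, so outside auditors can detect any after-the-fact editing of the surveillance feed:

```python
# Hypothetical sketch: a hash-chained, append-only audit log for monitoring data.
import hashlib
import json
from typing import Dict, List


def _digest(entry: Dict) -> str:
    return hashlib.sha256(json.dumps(entry, sort_keys=True).encode()).hexdigest()


class AuditLog:
    """Append-only log in which each record commits to its predecessor."""

    def __init__(self) -> None:
        self.records: List[Dict] = []

    def append(self, observation: str) -> None:
        prev = self.records[-1]["digest"] if self.records else "genesis"
        record = {"observation": observation, "prev": prev}
        record["digest"] = _digest({"observation": observation, "prev": prev})
        self.records.append(record)

    def verify(self) -> bool:
        prev = "genesis"
        for record in self.records:
            expected = _digest({"observation": record["observation"], "prev": prev})
            if record["digest"] != expected or record["prev"] != prev:
                return False  # the chain has been broken or edited
            prev = record["digest"]
        return True


if __name__ == "__main__":
    log = AuditLog()
    log.append("No prohibited fissile activity detected at site A")
    log.append("Anomalous precursor purchase flagged at site B")
    print(log.verify())                        # True: chain intact
    log.records[0]["observation"] = "edited"   # simulate tampering
    print(log.verify())                        # False: tampering detected
```

A mechanism along these lines addresses part of the “watching the watchers” concern: the monitors themselves cannot quietly rewrite what they previously reported.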

Q4: In the US, EU, and China, how and where is national decision-making power held, and how has the advent of advanced AI changed that, if at all?

Between 2024 and 2032, the US switched its politics from a troubled bipolar system, with Republicans and Democrats battling each other with intense hostility, into a multi-party system, with a dynamic fluidity of new electoral groupings. The winner of the 2032 election was, for the first time since the 1850s, from neither of the formerly dominant parties. What enabled this transition was the adoption, in stages, of ranked choice voting, in which electors rank the candidates in order of preference. This change enabled electors to express interest in new parties without fearing that their votes would be “wasted” or would inadvertently allow the election of particularly detested candidates.
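For concreteness, here is a compact sketch of one common counting method for ranked choice voting (instant-runoff). The scenario doesn’t say which variant was adopted, and the ballots below are invented purely for illustration:

```python
# Illustrative instant-runoff count: eliminate the weakest candidate each round
# and transfer their ballots, until someone holds a majority.
from collections import Counter
from typing import List


def instant_runoff(ballots: List[List[str]]) -> str:
    """Each ballot lists candidates from most to least preferred."""
    candidates = {c for ballot in ballots for c in ballot}
    while True:
        # Count each ballot's highest-ranked surviving candidate.
        tallies = Counter(
            next(c for c in ballot if c in candidates)
            for ballot in ballots
            if any(c in candidates for c in ballot)
        )
        total = sum(tallies.values())
        leader, leader_votes = tallies.most_common(1)[0]
        if leader_votes * 2 > total or len(candidates) == 1:
            return leader
        # Eliminate the candidate with the fewest first-choice votes.
        candidates.discard(min(tallies, key=tallies.get))


if __name__ == "__main__":
    ballots = [
        ["New Party", "Democrat", "Republican"],
        ["Democrat", "New Party", "Republican"],
        ["Republican", "Democrat", "New Party"],
        ["New Party", "Republican", "Democrat"],
        ["Democrat", "New Party", "Republican"],
    ]
    print(instant_runoff(ballots))  # "Democrat" wins after "Republican" is eliminated
```

The relevant property is visible even in this toy election: a first-choice vote for a new party is not “wasted”, because the ballot transfers to the voter’s next preference once that candidate is eliminated.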

The EU led the way in adoption of a “house of AI” as a reviewing body for proposed legislation. Legislation proposed by human politicians was examined by AI, resulting in suggested amendments, along with detailed explanations from the AI of reasons for making these changes. The EU left the ultimate decisions – whether or not to accept the suggestions – in the hands of human politicians. Over time, AI judgements were accepted on more and more occasions, but never uncritically.

China remained apprehensive until the mid-2030s about adopting multi-party politics with full tolerance of dissenting opinions. This apprehension was rooted in historic distrust of the apparent anarchy and dysfunction of politicians who needed to win approval of seemingly fickle electors. However, as AI evidently improved the calibre of online public discussion, with its real-time fact-checking, the Chinese system embraced fuller democratic reforms.

Q5: Is the global distribution of wealth (as measured say by national or international Gini coefficients) more, or less, unequal than 2022’s, and by how much? How did it get that way?

The global distribution of wealth became more unequal during the 2020s before becoming less unequal during the 2030s.

Various factors contributed to inequality increasing:

  • “Winner takes all”: Companies offering second-best products were unable to survive in the marketplace. Swift flows of both information and goods meant that all customers knew about better products and could easily purchase them
  • Financial rewards from the successes of companies increasingly flowed to the owners of the capital deployed, rather than to the people supplying skills and services. That’s because more of the skills and services could be supplied by automation, driving down the salaries that could be claimed by people who were offering the same skills and services
  • The factors that made some products better than others increasingly involved technological platforms, such as the latest AI systems, that were owned by a very small number of companies
  • Companies were able to restructure themselves ingeniously in order to take advantage of tax loopholes and special deals offered by countries desperate for at least some tax revenue.

What caused these trends to reverse was, in short, better politics:

  • Smart collaboration between the national governments of the world, avoiding tax loopholes
  • Recognition by greater numbers of voters of the profound merits of greater redistribution of the fruits of the remarkable abundance of NBIC technologies, as the percentage of people in work declined, and as the problems of parts of society being “left behind” were more fully recognised.

Q6: What is a major problem that AI has solved in your world, and how did it do so?

AI made many key contributions toward the solution of climate change:

  • By enabling more realistic and complete models of all aspects of the climate, including potential tipping points ahead of major climate phase transitions
  • By improving the design of alternative energy sources, including ground-based geothermal, high-altitude winds, ocean-based waves, space-based solar, and several different types of nuclear energy
  • Very significantly, by accelerating designs of commercially meaningful nuclear fusion
  • By identifying the types of “negative emissions technologies” that had the potential to scale up quickly in effectiveness
  • By accelerating the adoption of improved “cultivated meat” as a source of food with many advantages over animal-based agriculture – addressing issues with land use, water use, antibiotic use, and greenhouse gas emissions, and putting an end to the vile practice of the mass slaughter of sentient creatures
  • By assisting the design of new types of cement, glass, plastics, fertilisers, and other materials whose manufacture had previously caused large emissions of greenhouse gases
  • By recommending the sorts of marketing messages that were most effective in changing the minds of previous opponents of effective action.

To be clear, AI did this as part of “NBIC convergence”, in which there are mutual positive feedback loops between progress in each of nanotech, biotech, infotech, and cognotech.

Q7: What is a new social institution that has played an important role in the development of your world?

The G7 group of the democratic countries with the largest economies transitioned in 2023 into the D16, with a sharper commitment than before to championing the core values of democracy: openness; free and fair elections; the rule of law; independent media, judiciary, and academia; power being distributed rather than concentrated; and respect for autonomous decisions of groups of people.

The D16 was envisioned from the beginning as intended to grow in size, to become a global complement to the functioning of the United Nations, able to operate in circumstances that would have resulted in a veto at the UN from countries that paid only lip service to democracy.

One of the first projects of the D16 was to revise the Universal Declaration of Human Rights from the form initially approved by the United Nations General Assembly in 1948, to take account of the opportunities and threats from new technologies, including what are known as “transhuman rights”.

In parallel, another project reached agreement on how to measure an “Index of Human Flourishing”, that could replace the economic measure GDP (Gross Domestic Product) as the de-facto principal indication of wellbeing of societies.

The group formally became the D40 in 2030 and the D90 in 2034. By that time, the D90 was central to agreements to vigorously impose an updated version of the Singularity Principles. Any group anywhere in the world – inside or outside the D90 – that sought to work around these principles, was effectively shut down due to strict economic sanctions.

Q8: What is a new non-AI technology that has played an important role in the development of your world?

Numerous fields have been transformed by atomically precise manufacturing, involving synthetic nanoscale assembly factories. These had been envisioned in various ways by Richard Feynman in 1959 and Eric Drexler in 1986, but did not become commercially viable until the early 2030s.

It had long been recognised that an “existence proof” for nanotechnology was furnished by the operation of ribosomes inside biological cells, with their systematic assembly of proteins from genetic instructions. However, creation of comparable synthetic systems needed to wait for assistance in both design and initial assembly from increasingly sophisticated AI. (DeepMind’s AlphaFold software had given an early indication of these possibilities back in 2021.) Once the process had started, significant self-improvement loops soon accelerated, with each new generation of nanotechnology assisting in the creation of a subsequent better generation.

The benefits flowed both ways: nanotech precision allowed breakthroughs in the manufacture of new types of computer hardware, including quantum computers; these in turn supported better types of AI algorithms.

Nanotech had a dramatic positive impact on practices in the production of food, accommodation, clothing, and all sorts of consumer goods. Three areas particularly deserve mention:

  • Precise medical interventions, to repair damage to biological systems
  • Systems to repair damage to the environment as a whole, via a mixture of recycling and regeneration, as well as “negative emissions technologies” operating in the atmosphere
  • Clean energy sources operating at ever larger scale, including atomic-powered batteries

Q9: What changes to the way countries govern the development and/or deployment and/or use of emerging technologies (including AI), if any, played an important role in the development of your world?

Effective governance of emerging technologies involved both voluntary cooperation and enforced cooperation.

Voluntary cooperation – a desire to avoid actions that could lead to terrible outcomes – depended in turn on:

  • An awareness of the risk pathways – similar to the way that Carl Sagan and his colleagues vividly brought to the attention of world leaders in the early 1980s the potential global catastrophe of “nuclear winter”
  • An understanding that the restrictions being accepted would not hinder the development of truly beneficial products
  • An appreciation that everyone would be compelled to observe the same restrictions, and couldn’t gain some short-sighted advantage by breaching the rules.

The enforcement elements depended on:

  • An AI-powered “trustable monitoring system” that was able to detect, through pervasive surveillance, any potential violations of the published restrictions
  • Strong international cooperation, by the D40 and others, to isolate and remove resources from any maverick elements, anywhere in the world, that failed to respect these restrictions.

Public acceptance of trustable monitoring accelerated once it was understood that the systems performing the surveillance could, indeed, be trusted; they would not confer any inappropriate advantage on any grouping able to access the data feeds.

The entire system was underpinned by a vibrant programme of research and education (part of a larger educational initiative known as the “Vital Syllabus”), that:

  • Kept updating the “Singularity Principles” system of restrictions and incentives in the light of improved understanding of the risks and solutions
  • Ensured that the importance of these principles was understood both widely and deeply.

Q10: Pick a sector of your choice (education, transport, energy, communication, finance, healthcare, tourism, aerospace, materials etc.) and describe how that sector was transformed with AI in your world.

For most of human history, religion had played a pivotal role in shaping people’s outlooks and actions. Religion provided narratives about ultimate purposes. It sanctified social structures. It highlighted behaviour said to be exemplary, as demonstrated in the lives of key religious figures. And it deplored other behaviours said to lead to very bad consequences, if not in the present life, then in an assumed afterlife.

Nevertheless, the philosophical justifications for religions had come under increasing challenge in recent times, with the growth of appreciation of a scientific worldview (including evolution by natural selection), the insights from critical analysis of previously venerated scriptures, and a stark awareness of the tensions between different religions in a multi-polar world.

The decline of influence of religion had both good and bad consequences. Greater freedom of thought and action was accompanied by a shrinking of people’s mental horizons. Without the transcendent appeal of a religious worldview, people’s lives often became dominated instead by egotism or consumerism.

The growth of the transhumanist movement in the 2020s provided one counter to these drawbacks. It was not a religion in the strict sense, but its identification of solutions such as “the abolition of aging”, “paradise engineering”, and “technological resurrection” stirred deep inner personal transformations.

These transformations reached a new level thanks to AGI-facilitated encounters with religious founders, inside immersive virtual reality simulations. New hallucinogenic substances provided extra richness to these experiences. The sector formerly known as “religion” therefore experienced an unexpected renewal. Thank AGI!

Q11: What is the life expectancy of the most wealthy 1% and of the least wealthy 20% of your world; how and why has this changed since 2022?

In response to the question, “How much longer do you expect to live?”, the usual answer is “at least another hundred years”.

This answer reflects a deep love of life: people are glad to be alive and have huge numbers of quests, passions, projects, and personal voyages that they are enjoying or to which they’re looking forward. The answer also reflects the extraordinary observation that, these days, very few people die. That’s true in all sectors of society, and in all countries of the world. Low-cost high-quality medical treatments are widely available, to reverse diseases that were formerly fatal, and to repair biological damage that had accumulated earlier in people’s lives. People not only live longer but become more youthful.

The core ideas behind these treatments had been clear since the mid-2020s. Biological metabolism generates as a by-product of its normal operation an assortment of damage at the cellular and intercellular levels of the body. Biology also contains mechanisms for the repair of such damage, but over time, these repair mechanisms themselves lose vitality. As a result, people manifest various so-called “hallmarks of aging”. However, various interventions involving biotech and nanotech can revitalise these repair mechanisms. Moreover, other interventions can replace entire biological systems, such as organs, with bio-synthetic alternatives that actually work better than the originals.

Such treatments were feared and even resisted for a while, by activists such as the “naturality advocates”, but the evident improvements these treatments enabled soon won over the doubters.

Q12: In the US, considering the human rights enumerated in the UN declaration, which rights are better respected and which rights are worse respected in your world than in 2022? Why? How?

In a second country of your choice, which rights are better and which rights are worse respected in your world than in 2022, and why/how?

Regarding the famous phrase, “Everyone has the right to life, liberty and security of person”, all three of these fundamental rights are upheld much more fully, around the world, in 2045 than in 2022:

  • “Life” no longer tends to stop around the age of seventy or eighty; even people aged well over one hundred look forward to continuing to enjoy the right to life
  • “Liberty” involves more choices about lifestyles, personal philosophy, morphological freedom (augmentation and variation of the physical body) and sociological freedom (new structures for families, social groupings, and self-determined nations); importantly, these are not just “choices in theory” but are “choices in practice”, since means are available to support these modifications
  • “Security” involves greater protection from hazards such as extreme weather, pandemics, criminal enterprises, infrastructure hacking, and military attacks.

These improvements in the observation of rights are enabled by technologies of abundance, operated within a much-improved political framework.

Obtaining these benefits involved people agreeing to give up various possible actions that would have led to fewer freedoms and rights overall:

  • “Rights” to pollute the environment or to inflict other negative externalities
  • “Rights” to restrict the education of their girl children
  • “Rights” to experiment with technology without first completing a full safety analysis.

For a while, some countries like China provided their citizens with only a sham democracy, fearing an irresponsible exercise of that freedom. But by the mid-2030s, that fear had dissipated, and people in all countries gained fuller participatory rights in governance and lifestyle decisions.

Q13: What’s been a notable trend in the way that people are finding fulfilment?

For most of history, right up to the late 2020s, many people viewed themselves through the prism of their occupation or career. “I’m a usability designer”, they might have said. Or “I’m a data scientist” or “I’m a tour guide”, and so on. Their assessment of their own value was closely linked to the financial rewards they obtained from being an employee.

However, as AI became more capable of undertaking all aspects of what had previously been people’s jobs – including portions involving not only diligence and dexterity but also creativity and compassion – there was a significant decline in the proportion of overall human effort invested in employment. By the late 2030s, most people had stopped looking for paid employment, and were content to receive “universal citizens’ dividend” benefits from the operation of sophisticated automated production facilities.

Instead, more and more people found fulfilment by pursuing any of an increasing number of quests and passions. These included both solitary and collaborative explorations in music, art, mathematics, literature, and sport, as well as voyages in parts of the real world and in myriads of fascinating shared online worlds. In all these projects, people found fulfilment, not by performing better than an AI (which would be impossible), but by improving on their own previous achievements, or in friendly competition with acquaintances.

Careful prompting by the AGI helps to maintain people’s interest levels and a sense of ongoing challenge and achievement. AGI has proven to be a wonderful coach.

A year-by-year timeline to 2045

The ground rules for the worldbuilding competition were attractive:

  • The year is 2045.
  • AGI has existed for at least 5 years.
  • Technology is advancing rapidly and AI is transforming the world sector by sector.
  • The US, EU and China have managed a steady, if uneasy, power equilibrium.
  • India, Africa and South America are quickly on the rise as major players.
  • Despite ongoing challenges, there have been no major wars or other global catastrophes.
  • The world is not dystopian and the future is looking bright.

Entrants were asked to submit four pieces of work. One was a new media piece. I submitted this video:

Another required piece was:

timeline with entries for each year between 2022 and 2045 giving at least two events (e.g. “X invented”) and one data point (e.g. “GDP rises by 25%”) for each year.

The timeline I created dovetailed with the framework from the above video. Since I enjoyed creating it, I’m sharing my submission here, in the hope that it may inspire readers.

(Note: the content was submitted on 11th April 2022.)

2022

US mid-term elections result in log-jammed US governance, widespread frustration, and a groundswell desire for more constructive approaches to politics.

The collapse of a major crypto “stablecoin” results in much wider adverse repercussions than was generally expected, and a new social appreciation of the dangers of flawed financial systems.

Data point: Number of people killed in violent incidents (including homicides and armed conflicts) around the world: 590,000

2023

Fake news, spread via social media by a new variant of AI, provokes riots in which more than 10,000 people die, leading to much greater interest in a set of “Singularity Principles” that had previously been proposed to steer the development of potentially world-transforming technologies.

G7 transforms into the D16, consisting of the world’s 16 leading democracies, proclaiming a profound shared commitment to champion norms of: openness; free and fair elections; the rule of law; independent media, judiciary, and academia; power being distributed rather than concentrated; and respect for autonomous decisions of groups of people.

Data point: Proportion of world population living in countries that are “full democracies” as assessed by the Economist: 6.4%

2024

South Korea starts a trial of a nationwide UBI scheme, the first of what will become in later years a long line of increasingly robust “universal citizens’ dividend” schemes around the world.

A previously unknown offshoot of ISIS releases a bioengineered virus. Fortunately, vaccines are quickly developed and deployed against it. In parallel, a bitter cyber war takes place between Iran and Israel. These incidents lead to international commitments to prevent future recurrences.

Data point: Proportion of people of working age in US who are not working and who are not looking for a job: 38%

2025

Extreme weather – floods and storms – kills tens of thousands in both North America and Europe. A major trial of geo-engineering is rushed through, with reflection of solar radiation in the stratosphere – causing global political disagreement and then a renewed determination for tangible shared action on climate change.

The US President appoints a Secretary for the Future as a top-level cabinet position. More US states adopt ranked-choice voting, allowing third parties to grow in prominence.

Data point: Proportion of earth’s habitable land used to rear animals for human food: 38%

2026

A song created entirely by an AI tops the hit parade, and initiates a radical new musical genre.

Groundswell opposition to autocratic rule in Russia leads to the fall from power of the president and a new dedication to democracy throughout countries formerly perceived as being within Russia’s sphere of direct influence.

Data point: Net greenhouse gas emissions (including those from land-use changes): 59 billion tons of CO2 equivalent – an unwelcome record.

2027

Metformin approved for use as an anti-aging medicine in a D16 country. Another D16 country recommends nationwide regular usage of a new nootropic drug.

Exchanges of small numbers of missiles between North and South Korea lead to regime change inside North Korea and a rapprochement between the long-bitter enemies.

Data point: Proportion of world population living in countries that are “full democracies” as assessed by the Economist: 9.2%

2028

An innovative nuclear fusion system, with its design assisted by AI, runs for more than one hour and generates significantly more energy than was put in.

As a result of disagreements about the future of an independent Taiwan, an intense and destructive cyber battle takes place. In its aftermath, the nations of the world commit more seriously than before to avoiding any future cyber battles.

Data point: Proportion of world population experiencing mental illness or dissatisfied with the quality of their mental health: 41%

2029

A trial of an anti-aging intervention in middle-aged dogs is confirmed to have increased remaining life expectancy by 25% without causing any adverse side effects. Public interest in similar interventions in humans skyrockets.

The UK rejoins a reconfigured EU, as an indication of support for sovereignty that is pooled rather than narrow.

Data point: Proportion of world population with formal cryonics arrangements: 1 in 100,000

2030

Russia is admitted into the D40 – a newly expanded version of the D16. The D40 officially adopts an “Index of Human Flourishing” as a more important metric than GDP, and agrees a revised version of the Universal Declaration of Human Rights, brought up to date with transhuman issues.

First permanent implant in a human of an artificial heart with a new design that draws all required power from the biology of the body rather than from any attached battery, and whose pace of operation is under the control of the brain.

Data point: Net greenhouse gas emissions (including those from land-use changes): 47 billion tons of CO2 equivalent – a significant improvement

2031

An AI discovers and explains a profound new way of looking at mathematics, DeepMath, leading in turn to dramatically successful new theories of fundamental physics.

Widespread use of dynamically re-programmed nanobots to treat medical conditions that would previously have been fatal.

Data point: Proportion of world population regularly taking powerful anti-aging medications: 23%

2032

First person reaches the age of 125. Her birthday celebrations are briefly disrupted by a small group of self-described “naturality advocates” who chant “120 is enough for anyone”, but that group has little public support.

D40 countries put in place a widespread “trustable monitoring system” to cut down on existential risks (such as spread of WMDs) whilst maintaining citizens’ trust.

Data point: Proportion of world population living in countries that are “full democracies” as assessed by the Economist: 35.7% 

2033

For the first time since the 1850s, the US President comes from a party other than the Republicans or Democrats.

An AI system convincingly passes the Turing test, impressing even its previously staunchest critics with its apparent grasp of general knowledge and common sense. The answers it gives to questions involving moral dilemmas also impress previous sceptics.

Data point: Proportion of people of working age in US who are not working and who are not looking for a job: 58%

2034

The D90 (expanded from the D40) agrees to vigorously impose Singularity Principles rules to avoid inadvertent creation of dangerous AGI.

Atomically precise synthetic nanoscale assembly factories have come of age, in line with the decades-old vision of nanotechnology pioneer Eric Drexler, and are proving to have just as consequential an impact on human society as AI.

Data point: Net greenhouse gas *removals*: 10 billion tons of CO2 equivalent – a dramatic improvement

2035

A novel written entirely by an AI reaches the top of the New York Times bestseller list, and is widely celebrated as being the finest piece of literature ever produced.

Successful measures to remove greenhouse gases from the atmosphere, coupled with wide deployment of clean energy sources, lead to a declaration of “victory over runaway climate change”.

Data point: Proportion of earth’s habitable land used to rear animals for human food: 4%

2036

A film created entirely by an AI, without any real human actors, wins Oscar awards.

The last major sceptical holdout, a philosophy professor from an Ivy League university, accepts that AGI now exists. The pope gives his blessing too.

Data point: Proportion of world population with cryonics arrangements: 24%

2037

The last instances of the industrial-scale slaughter of animals for human consumption, on account of the worldwide adoption of cultivated (lab-grown) meat.

AGI convincingly explains that it is not sentient, and that it has a very different fundamental structure from that of biological consciousness.

Data point: Proportion of world population who are literate: 99.3%

2038

Rejuvenation therapies are in wide use around the world. “Eighty is the new fifty”. First person reaches the age of 130.

Improvements made by the AGI to itself effectively raise its IQ a hundredfold, taking it far beyond the comprehension of human observers. However, the AGI provides explanatory educational material that allows people to understand vast new sets of ideas.

Data point: Proportion of world population who consider themselves opposed to AGI: 0.1%

2039

An extensive set of “vital training” sessions has been established by the AGI, with all citizens over the age of ten participating for a minimum of seven hours per day on 72 days each year, to ensure that humans develop and maintain key survival skills.

Menopause reversal is commonplace. Women who had long ago given up any ideas of bearing another child happily embrace motherhood again.

Data point: Proportion of world population regularly taking powerful anti-aging medications: 99.2%

2040

The use of “mind phones” is widespread: new brain-computer interfaces that allow communication between people by mental thought alone.

People regularly opt to have several of their original biological organs replaced by synthetic alternatives that are more efficient, more durable, and more reliable.

Data point: Proportion of people of working age in US who are not working and who are not looking for a job: 96%

2041

Shared immersive virtual reality experiences include hyper-realistic simulations of long-dead individuals – including musicians, politicians, royalty, saints, and founders of religions.

The number of miles of journey undertaken by small “flying cars” exceeds that of ground-based powered transport.

Data point: Proportion of world population living in countries that are “full democracies” as assessed by the Economist: 100.0%

2042

First successful revival of a mammal from cryopreservation.

AGI presents a proof of the possibility of time travel, but shows that the safe transit of humans through time would require resources equivalent to building a Dyson sphere around the sun.

Data point: Proportion of world population experiencing mental illness or dissatisfied with the quality of their mental health: 0.4%

2043

First person reaches the age of 135, and declares herself to be healthier than at any time in the preceding four decades.

As a result of virtual reality encounters with avatars of the founders of religions, a number of new systems of philosophical and mystical thinking grow in popularity.

Data point: Proportion of world’s energy provided by earth-based nuclear fusion: 75%

2044

First human baby born from an ectogenetic pregnancy.

Family holidays on the Moon are an increasingly common occurrence.

Data point: Average amount of their waking time that people spend in a metaverse: 38%

2045

First revival of human from cryopreservation – someone who had been cryopreserved ten years previously.

Subtle messages decoded by AGI from far distant stars in the galaxy confirm that other intelligent civilisations exist, and are on their way to reveal themselves to humanity.

Data point: Number of people killed in violent incidents around the world: 59

Postscript

My thanks go to the competition organisers, the Future of Life Institute, for providing the inspiration for the creation of the above timeline.

Readers are likely to have questions in their minds as they browse the timeline above. More details of the reasoning behind the scenarios involved are contained in three follow-up posts:
