
2 September 2023

Bletchley Park: Seven dangerous failure modes – and how to avoid them

Filed under: Abundance, AGI, Events, leadership, London Futurists — David Wood @ 7:13 am

An international AI Safety Summit is being held on 1st and 2nd November at the historic site of Bletchley Park, Buckinghamshire. It’s convened by none other than the UK’s Prime Minister, Rishi Sunak.

It’s a super opportunity for a much-needed global course correction in humanity’s relationship with the fast-improving technology of AI (Artificial Intelligence), before AI passes beyond our understanding and beyond our control.

But when we look back at the Summit in, say, two years’ time, will we assess it as an important step forward, or as a disappointing wasted opportunity?

(Image credit: this UK government video)

On the plus side, there are plenty of encouraging words in the UK government’s press release about the Summit:

International governments, leading AI companies and experts in research will unite for crucial talks in November on the safe development and use of frontier AI technology, as the UK Government announces Bletchley Park as the location for the UK summit.

The major global event will take place on the 1st and 2nd November to consider the risks of AI, especially at the frontier of development, and discuss how they can be mitigated through internationally coordinated action. Frontier AI models hold enormous potential to power economic growth, drive scientific progress and wider public benefits, while also posing potential safety risks if not developed responsibly.

To be hosted at Bletchley Park in Buckinghamshire, a significant location in the history of computer science development and once the home of British Enigma codebreaking – it will see coordinated action to agree a set of rapid, targeted measures for furthering safety in global AI use.

Nevertheless, I’ve seen several similar vital initiatives get side-tracked in the past. When we should be at our best, we can instead be overwhelmed by small-mindedness, by petty tribalism, and by obsessive political wheeling and dealing.

Since the stakes are so high, I’m compelled to draw attention, in advance, to seven ways in which this Summit could turn out to be a flop.

My hope is that, by being aired in advance, these predictions will prove self-defeating rather than self-fulfilling.

1.) Preoccupation with easily foreseen projections of today’s AI

It’s likely that AI in just 2-3 years will possess capabilities that surprise even the most far-sighted of today’s AI developers. That’s because, as we build larger systems of interacting artificial neurons and other computational modules, the resulting systems are displaying unexpected emergent features.

Accordingly, these systems are likely to possess new ways (and perhaps radically new ways) of:

  • Observing and forecasting
  • Spying and surveilling
  • Classifying and targeting
  • Manipulating and deceiving.

But despite their enhanced capabilities, these systems may still on occasion miscalculate, hallucinate, overreach, suffer from bias, or fail in other ways – especially if they can be hacked or jail-broken.

Just because some software is super-clever, it doesn’t mean it’s free from all bugs, race conditions, design blind spots, mistuned configurations, or other defects.

What this means is that the risks and opportunities of today’s AI systems – remarkable as they are – will likely be eclipsed by the risks and opportunities of the AI systems of just a few years’ time.

A seemingly unending string of pundits is ready to drone on and on about the risks and opportunities of today’s AI systems. Yes, these conversations are important. However, if the Summit becomes preoccupied by those conversations, and gives insufficient attention to the powerful disruptive new risks and opportunities that may arise shortly afterward, it will have failed.

2.) Focusing only on innovation and happy talk

We all like to be optimistic. And we can tell lots of exciting stories about the helpful things that AI systems will be able to do in the near future.

However, we won’t be able to receive these benefits if we collectively stumble before we get there. And the complications of next generation AI systems mean that a number of dimly understood existential landmines stand in our way:

  • If the awesome powers of new AI are used for malevolent purposes by bad actors of various sorts
  • If an out-of-control race between well-meaning competitors (at either the commercial or geopolitical level) results in safety corners being cut, with disastrous consequences
  • If perverse economic or psychological incentives lead people to turn a blind eye to risks of faults in the systems they create
  • If an AI system that has an excellent design and implementation is nevertheless hacked into a dangerous alternative mode
  • If an AI system follows its own internal logic to conclusions very different from what the system designers intended (this is sometimes described as “the AI goes rogue”).

In short, too much happy talk, or insufficiently careful attention to these profound danger modes, will cause the Summit to fail.

3.) Too much virtue signalling

One of the worst aspects of meetings about the future of AI is when attendees seem to enter a kind of virtue competition, uttering pious phrases such as:

  • “We believe AI must be fair”
  • “We believe AI must be just”
  • “We believe AI must avoid biases”
  • “We believe AI must respect human values”

This is like Nero fiddling whilst Rome burns.

What the Summit must address are the very tangible threats of AI systems contributing to outcomes much worse than particular groups of individuals being treated badly. What’s at stake here is, potentially, the lives of hundreds of millions of people – perhaps more – should an AI-induced catastrophe occur.

The Summit is not the place for holier-than-thou sanctimonious puff. Facilitators should make that clear to all participants.

4.) Blindness to the full upside of next generation AI

Whilst one failure mode is to underestimate the scale of catastrophic danger that next generation AI might unleash, another failure mode is to underestimate the scale of profound benefits that next generation AI could provide.

What’s within our grasp isn’t just a potential cure for, say, one type of cancer, but a potential cure for all chronic diseases, via AI-enabled therapies that will comprehensively undo the biological damage throughout our bodies that we normally call ageing.

Again, what’s within our grasp isn’t just ways to be more efficient and productive at work, but ways in which AI will run the entire economy on our behalf, generating a sustainable superabundance for everyone.

Therefore, at the same time as huge resources are being marshalled for two vital tasks:

  • The creation of AI superintelligence
  • The creation of safe AI superintelligence

we should also keep clearly in mind one additional crucial task:

  • The creation of AI superbenevolence

5.) Accepting the wishful thinking of Big Tech representatives

As Upton Sinclair highlighted long ago, “It is difficult to get a man to understand something, when his salary depends on his not understanding it.”

The leaders of Big Tech companies are generally well-motivated: they want their products to deliver profound benefits to humanity.

Nevertheless, they are inevitably prone to wishful thinking. In their own minds, their companies will never make the kind of gross errors that happened at, for example, Union Carbide (Bhopal disaster), BP (Deepwater Horizon disaster), NASA (Challenger and Columbia shuttle disasters), or Boeing (737 Max disaster).

But especially in times of fierce competition (such as the competition to be the web’s preferred search tool, with all the vast advertising revenues that follow), it’s all too easy for these leaders to turn a blind eye, probably without consciously realising it, to significant disaster possibilities.

Accordingly, there must be people at the Summit who are able to hold these Big Tech leaders to sustained serious account.

Agreements for “voluntary” self-monitoring of safety standards will not be sufficient!

6.) Not engaging sufficiently globally

If an advanced AI system goes wrong, it’s unlikely to impact just one country.

Given the interconnectivity of the world’s many layers of infrastructure, it’s critical that the solutions proposed by the Summit have a credible roadmap to adoption all around the world.

This is not a Summit where it will be sufficient to persuade the countries that are already “part of the choir”.

I’m no fan of diversity-for-diversity’s-sake. But on this occasion, it will be essential to transcend the usual silos.

7.) Insufficient appreciation of the positive potential of government

One of the biggest myths of the last several decades is that governments can make only a small difference, and that the biggest drivers of lasting change in the world are other forces, such as the free market, military power, YouTube influencers, or popular religious sentiment.

On the contrary, with a wise mix of incentives and restrictions – subsidies and penalties – government can make a huge difference in the well-being of society.

Yes, national industrial policy often misfires, due to administrative incompetence. But there are better examples, where inspirational government leadership transformed the entire operating environment.

The best response to the global challenge of next generation AI will involve a new generation of international political leaders demonstrating higher skills of vision, insight, agility, collaboration, and dedication.

This is not the time for political lightweights, blowhards, chancers, or populist truth-benders.

Footnote: The questions that most need to be tabled

London Futurists is running a sequence of open surveys into scenarios for the future of AI.

Round one has concluded. Round two has just gone live (here).

I urge everyone concerned about the future of AI to take a look at that new survey, and to enter their answers and comments into the associated Google Form.

That’s a good way to gain a fuller appreciation of the scale of the issues that should be considered at Bletchley Park.

That will reduce the chance that the Summit is dominated by small-mindedness, by petty tribalism, or by politicians merely seeking a media splash. Instead, it will raise the chance that the Summit seriously addresses the civilisation-transforming nature of next generation AI.

Finally, see here for an extended analysis of a set of principles that can underpin a profoundly positive relationship between humanity and next generation AI.
