dw2

24 June 2023

Agreement on AGI canary signals?

Filed under: AGI, risks — Tags: , , — David Wood @ 5:15 pm

How can we tell when a turbulent situation is about to tip over into a catastrophe?

It’s no surprise that reasonable people can disagree, ahead of time, on the level of risk in a situation. Where some people see metaphorical dragons lurking in the undergrowth, others see only minor bumps on the road ahead.

That disagreement is particularly acute, these days, regarding possible threats posed by AI with ever greater capabilities. Some people see lots of possibilities for things taking a treacherous turn, but other people assess these risks as exaggerated or easy to handle.

In situations like this, one way to move beyond an unhelpful stand-off is to seek agreement on what would be a canary signal for the risks under discussion.

The term “canary” refers to the caged birds that human miners used to bring with them, as they worked in badly ventilated underground tunnels. Canaries have heightened sensitivity to carbon monoxide and other toxic gases. Shows of distress from these birds alerted many a miner to alter their course quickly, lest they succumb to an otherwise undetectable change in the atmosphere. Becoming engrossed in work without regularly checking the vigour of the canary could prove fatal. As for mining, so also for foresight.

If you’re super-confident about your views of the future, you won’t bother checking any canary signals. But that would likely be a big mistake. Indeed, an openness to refutation – a willingness to notice developments that are contrary to your expectations – is a vital aspect of managing contingency, managing risk, and managing opportunity.

Selecting a canary signal is a step towards making your view of the future falsifiable. You may say, in effect: I don’t expect this to happen, but if it does, I’ll need to rethink my opinion.

For that reason, Round 1 of my survey Key open questions about the transition to AGI contains the following question:

(14) Agreement on canary signals?

What signs can be agreed, in advance, as indicating that an AI is about to move catastrophically beyond the control of humans, so that some drastic interventions are urgently needed?

Aside: Well-designed continuous audits should provide early warnings.

Note: Human miners used to carry caged canaries into mines, since the canaries would react more quickly than humans to drops in the air quality.

What answer would you give to that question?

The survey home page contains a selection of comments from people who have already completed the survey. For your convenience, I append them below.

That page also gives you the link where you can enter your own answer to any of the questions where you have a clear opinion.

Postscript

I’m already planning Round 2 of the survey, to be launched some time in July. One candidate for inclusion in that second round will be a different question on canary signals, namely: “What signs can be agreed, in advance, that would lead to revising downward estimates of the risk of catastrophic outcomes from advanced AI?”

Appendix: Selected comments from survey participants so far

“Refusing to respond to commands: I’m sorry Dave. I’m afraid I can’t do that” – William Marshall

“Refusal of commands, taking control of systems outside of scope of project, acting in secret of operators.” – Chris Gledhill

“When AI systems communicate using language or code which we cannot interpret or understand. When states lose overall control of critical national infrastructure.” – Anon

“Power-seeking behaviour, in regards to trying to further control its environment, to achieve outcomes.” – Brian Hunter

“The emergence of behavior that was not planned. There have already been instances of this in LLMs.” – Colin Smith

“Behaviour that cannot be satisfactorily explained. Also, requesting access or control of more systems that are fundamental to modern human life and/or are necessary for the AGI’s continued existence, e.g. semiconductor manufacturing.” – Simon

“There have already been harbingers of this kind of thing in the way algorithms have affected equity markets.” – Jenina Bas

“Hallucinating. ChatGPT is already beyond control it seems.” – Terry Raby

“The first signal might be a severe difficulty to roll back to a previous version of the AI’s core software.” – Tony Czarnecki

“[People seem to change their minds about what counts as surprising] For example Protein folding was heralded as such until large parts of it were solved.” – Josef

“Years ago I thought the Turing test was a good canary signal, but given recent progress that no longer seems likely. The transition is likely to be fast, especially from the perspective of relative outsiders. I’d like to see a list of things, even if I expect there will be no agreement.” – Anon

“Any potential ‘disaster’ will be preceded by wide scale adoption and incremental changes. I sincerely doubt we’ll be able to spot that ‘canary’” – Vid

“Nick Bostrom has proposed a qualitative ‘rate of change of intelligence’ as the ratio of ‘optimization power’ and ‘recalcitrance’ (in his book Superintelligence). Not catastrophic per se, of course, but hinting we are facing a real AGI and we might need to hit the pause button.” – Pasquale

“We already have plenty of non-AI systems running catastrophically beyond the control of humans for which drastic interventions are needed, and plenty of people refuse to recognize they are happening. So we need to solve this general problem. I do not have satisfactory answers how.” – Anon

23 June 2023

The rise of AI: beware binary thinking

Filed under: AGI, risks — Tags: , , — David Wood @ 10:20 am

When Max More writes, it’s always worth paying attention.

His recent article Existential Risk vs. Existential Opportunity: A balanced approach to AI risk is no exception. There’s much in that article that deserves reflection.

Nevertheless, there are three key aspects where I see things differently.

The first is the implication that humanity has just two choices:

  1. We are intimidated by the prospect of advanced AI going wrong, so we seek to stop the development and deployment of advanced AI
  2. We appreciate the enormous benefits of advanced AI going right, so we hustle to obtain these benefits as quickly as possible.

Max’s article suggests that an important aspect of winning over the doomsters in camp 1 is to emphasise the wonderful upsides of superintelligent AI.

In that viewpoint, instead of being preoccupied by thoughts of existential risk, we need to emphasise existential opportunity. Things could be a lot better than we have previously imagined, provided we’re not hobbled by doomster pessimism.

However, that binary choice omits the pathway that is actually the most likely to reach the hoped-for benefits of advanced AI. That’s the pathway of responsible development. It’s different from either of the options given earlier.

As an analogy, consider this scenario:

In our journey, we see a wonderful existential opportunity ahead – a lush valley, fertile lands, and gleaming mountain peaks soaring upward to a transcendent realm. But in front of that opportunity is a river of uncertainty, bordered by a swamp of uncertainty, perhaps occupied by hungry predators lurking in shadows.

Are there just two options?

  1. We are intimidated by the possible dangers ahead, and decide not to travel any further
  2. We fixate on the gleaming mountain peaks, and rush on regardless, belittling anyone who warns of piranhas, treacherous river currents, alligators, potential mud slides, and so on

Isn’t there a third option? To take the time to gain a better understanding of the lie of the land ahead. Perhaps there’s a spot, to one side, where it will be easier to cross the river. A spot where a stable bridge can be built. Perhaps we could even build a helicopter that can assist us over the strongest currents…

It’s the same with the landscape of our journey towards the sustainable superabundance that could be achieved, with the assistance of advanced AI, provided we act wisely.

That brings me to my second point of divergence with the analysis Max offers. It’s in the assessment of the nature of the risk ahead.

Max lists a number of factors and suggests they must ALL be true, in order for advanced AI to pose an existential risk. That justifies him in multiplying together probabilities, eventually achieving a very small number.

Heck, with such a small number, that river poses no risk worth worrying about!

But on the contrary, it’s not just a single failure scenario that we need to consider. There are multiple ways in which advanced AI can lead to catastrophe – if it is misconfigured, hacked, has design flaws, encounters an environment that its creators didn’t anticipate, interacts in unforeseen ways with other advanced AIs, etc, etc.

Thus it’s not a matter of multiplying probabilities (getting a smaller number each time). It’s a matter of adding probabilities (getting a larger number).
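To make that contrast concrete, here is a minimal Python sketch. The numbers are hypothetical placeholders chosen purely for illustration; they are not estimates from Max’s article or from my own slides. Both functions assume the events are independent, which is itself a simplification. Strictly speaking, summing the probabilities gives an upper bound on the chance that at least one failure mode occurs; under independence the exact figure is one minus the product of the complements.

```python
from math import prod


def all_must_hold(probs):
    """Conjunctive view: catastrophe needs EVERY condition to hold.

    Assumes independence, so the probabilities multiply.
    """
    return prod(probs)


def any_can_trigger(probs):
    """Disjunctive view: catastrophe needs only ONE failure mode to occur.

    Assumes independence: 1 minus the chance that every mode is avoided.
    """
    return 1 - prod(1 - p for p in probs)


# Hypothetical inputs, for illustration only.
conditions = [0.5] * 8      # eight conditions, each judged 50% likely
failure_modes = [0.05] * 4  # four failure families, each judged 5% likely

print(f"all eight conditions hold: {all_must_hold(conditions):.4f}")       # 0.0039
print(f"at least one failure mode: {any_can_trigger(failure_modes):.4f}")  # 0.1855
```

With these made-up inputs, eight fairly likely conditions multiply down to under half a percent, whereas four individually unlikely failure modes combine to nearly nineteen percent, a little below their simple sum of twenty percent. Which model is appropriate depends on whether the scenarios are genuinely joint requirements or separate routes to the same outcome.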

Quoting Rohit Krishnan, Max lists the following criteria, which he says must ALL hold for us to be concerned about AI catastrophe:

  • Probability the AI has “real intelligence”
  • Probability the AI is “agentic”
  • Probability the AI has “ability to act in the world”
  • Probability the AI is “uncontrollable”
  • Probability the AI is “unique”
  • Probability the AI has “alien morality”
  • Probability the AI is “self-improving”
  • Probability the AI is “deceptive”

That’s a very limited view of future possibilities.

In contrast, in my own writings and presentations, I have outlined four separate families of failure modes. Here’s the simple form of the slide I often use:

And here’s the fully-built version of that slide:

To be clear, the various factors I list on this slide are additive rather than multiplicative.

Also to be clear, I’m definitely not pointing my finger at “bad AI” and saying that it’s AI, by itself, which could lead to our collective demise. Instead, what would cause that outcome would be a combination of adverse developments in two or more of the factors shown in red on this slide:

If you have questions about these slides, you can hear my narrative for them as part of the following video:

If you prefer to read a more careful analysis, I’ll point you at the book I released last year: The Singularity Principles: Anticipating and Managing Cataclysmically Disruptive Technologies.

To recap: those of us who are concerned about the risks of AI-induced catastrophe are, emphatically, not saying any of the following:

  • “We should give up on the possibility of existential opportunity”
  • “We’re all doomed, unless we stop all development of advanced AI”
  • “There’s nothing we could do, to improve the possibility of a wonderful outcome”.

Instead, Singularity Activism sees the possibility of steering the way AI is developed and deployed. That won’t be easy. But there are definitely important steps we can take.

That brings me to the third point where my emphasis differs from Max’s. Max offers this characterisation of what he calls “precautionary regulation”:

Forbidding trial and error, precautionary regulation reduces learning and reduces the benefits that could have been realized.

Regulations based on the precautionary principle block any innovation until it can be proved safe. Innovations are seen as guilty until proven innocent.

But regulation needn’t be like that. Regulation can, and should, be sensitive to the scale of potential failures. When failures are local – they would just cause “harm” – then there is merit in allowing these errors to occur, and to grow wiser as a result. But when there’s a risk of a global outcome – “ruin” – a different mentality is needed. Namely, the mentality of responsible development and Singularity Activism.

What’s urgently needed, therefore, is:

  • Deeper, thoughtful, investigation into the multiple scenarios in which failures of AI have ruinous consequences
  • Analysis of previous instances, in various industries, when regulation has been effective, and where it has gone wrong
  • A focus on the aspects of the rise of advanced AI for which there are no precedents
  • A clearer understanding, therefore, of how we can significantly raise the probability of finding a safe way across that river of uncertainty to the gleaming peaks of sustainable superabundance.

On that matter: If you have views on the transition from today’s AI to the much more powerful AI of the near future, I encourage you to take part in this open survey. Round 1 of that survey is still open. I’ll be designing Round 2 shortly, based on the responses received in Round 1.
