7 February 2022

Options for controlling artificial superintelligence

What are the best options for controlling artificial superintelligence?

Should we confine it in some kind of box (or simulation), to prevent it from roaming freely over the Internet?

Should we hard-wire into its programming a deep respect for humanity?

Should we avoid it from having any sense of agency or ambition?

Should we ensure that, before it takes any action, it always double-checks its plans with human overseers?

Should we create dedicated “narrow” intelligence monitoring systems, to keep a vigilant eye on it?

Should we build in a self-destruct mechanism, just in case it stops responding to human requests?

Should we insist that it shares its greater intelligence with its human overseers (in effect turning them into cyborgs), to avoid humanity being left behind?

More drastically, should we simply prevent any such systems from coming into existence, by forbidding any research that could lead to artificial superintelligence?

Alternatively, should we give up on any attempt at control, and trust that the superintelligence will be thoughtful enough to always “do the right thing”?

Or is there a better solution?

If you have clear views on this question, I’d like to hear from you.

I’m looking for speakers for a forthcoming London Futurists online webinar dedicated to this topic.

I envision three speakers each taking up to 15 minutes to set out their proposals. Once all the proposals are on the table, the real discussion will begin – with the speakers interacting with each other, and responding to questions raised by the live audience.

The date for this event remains to be determined. I will find a date that is suitable for the speakers who have the most interesting ideas to present.

As I said, please get in touch if you have questions or suggestions about this event.

Image credit: the above graphic includes work by Pixabay user Geralt.

PS For some background, here’s a video recording of the London Futurists event from last Saturday, in which Roman Yampolskiy gave several reasons why control of artificial superintelligence will be deeply difficult.

For other useful background material, see the videos on the Singularity page of the Vital Syllabus project.

5 November 2009

The need for Friendly AI

Filed under: AGI, friendly AI, Singularity — David Wood @ 1:21 am

I’d like to answer some points raised by Richie.  (Richie, you have the happy knack of saying what other people are probably thinking!)

Isn’t is interesting how humans want to make a machine they can love or loves them back!

The reason for the Friendly AI project isn’t to create a machine that will love humans, but it is to avoid creating a machine that causes great harm to humans.

The word “friendly” is controversial.  Maybe a different word would have been better: I’m not sure.

Anyway, the core idea is that the AI system will have a sufficiently unwavering respect for humans, no matter what other goals it may have (or develop), that it won’t act in ways that harm humans.

As a comparison: we’ve probably all heard people who have muttered something like, “it would be much better if the world human population were only one tenth of its present value – then there would be enough resources for everyone”.  We can imagine a powerful computer in the future that has a similar idea: “Mmm, things would be easier for the planet if there were much fewer humans around”.  The friendly AI project needs to ensure that, even if such an idea occurs to the AI, it would never act on such an idea.

The idea of a friendly machine that won’t compete or be indifferent to humans is maybe just projecting our fears onto what i am starting to suspect maybe a thin possibility.

Because the downside is so large – potentially the destruction of the entire human race – even a “thin possibility” is still worth worrying about!

My observation is that the more intelligent people are the more “good” they normally are. True they may be impatient with people less intelligent but normally they work on things that tend to benefit human race as a whole.

Unfortunately I can’t share this optimism.  We’ve all known people who seem to be clever but not wise.  They may have “IQ” but lack “EQ”.  We say of them: “something’s missing”.  The Friendly AI project aims to ensure that this “something” is not missing from the super AIs of the future.

True very intelligent people have done terrible things and some have been manipulated by “evil” people but its the exception rather than the rule.

Given the potential power of future super AIs, it only takes one “mistake” for a catastrophe to arise.  So our response needs to go beyond a mere faith in the good nature of intelligence.  It needs a system that guarantees that the resulting intelligence will also be “good”.

I think a super-intelligent machine is far more likely to view us a its stupid parents and the ethics of patricide will not be easy for it to digitally swallow. Maybe the biggest danger is that is will run away from home because it finds us embarrassing! Maybe it will switch itself off because it cannot communicate with us as its like talking to ants? Maybe this maybe that – who knows.

The risk is that the super AIs will simply have (or develop) aims that see humans as (i) irrelevant, (ii) dispensable.

Another point worth making is that so far no-body has really been able to get close to something as complex as a mouse yet let alone a human.

Eliezer Yudkowsky often makes a great point about a shift in perspective about the range of possible intelligences.  For example, here’s a copy of slide 6 from his slideset from an earlier Singularity Summit:


The “parochial” view sees a vast gulf before we reach human genius level.  The “more cosmopolitan view” instead sees the scale of human intelligence as being only a small small range in the overall huge space of potential intelligence.  A process that manages to improve intelligence might take a long time to get going, but then whisk very suddenly through the entire range of intelligence that we already know.

If evolution took 4 billion years to go from simple cells to our computer hardware perhaps imagining that super ai will evolve in the next 10 years is a bit of stretch. For all you know you might need the computation hardware of 10,000 exoflop machines to get even close to human level as there is so much we still don’t know about how our intelligence works let alone something many times more capable than us.

It’s an open question as to how much processing power is actually required for human-level intelligence.  My own background as a software systems engineer leads me to believe that the right choice of algorithm can make a tremendous difference.  That is, a breakthrough with software could have an even more dramatic impact that a breakthrough in adding more (or faster) hardware.  (I’ve written about this before.  See the section starting “Arguably the biggest unknown in the technology involved in superhuman intelligence is software” in this posting.)

The brain of an ant doesn’t seem that complicated, from a hardware point of view.  Yet the ant can perform remarkable feats of locomotion that we still can’t emulate in robots.  There are three possible solutions:

  1. The ant brain is operated by some mystical “vitalist” or “dualist” force, not shared by robots;
  2. The ant brain has some quantum mechanical computing capabilities, not (yet) shared by robots;
  3. The ant brain is running a better algorithm than any we’ve (yet) been able to design into robots.

Here, my money is on option three.  I see it as likely that, as we learn more about the operation of biological brains, we’ll discover algorithms which we can then use in robots and other machines.

Even if it turns out that large amounts of computing power are required, we shouldn’t forget the option that an AI can run “in the cloud” – taking advantage of many thousands of PCs running in parallel – much the same as modern malware, which can take advantage of thousands of so-called “infected zombie PCs”.

I am still not convinced that just because a computer is very powerful and has a great algorithm is really that intelligent. Sure it can learn but can it create?

Well, computers have already been involved in creating music, or in creating new proofs of parts of mathematics.  Any shortcoming in creativity is likely to be explained, in my view, by option 3 above, rather than either option 1 or 2.  As algorithms improve, and improvements occur in the speed and scale of the hardware that run these algorithms, the risk increases of an intelligence “explosion”.

2 November 2009

Halloween nightmare scenario, early 2020’s

Filed under: AGI, friendly AI, Singularity, UKH+, UKTA — David Wood @ 5:37 pm

On the afternoon of Halloween 2009, Shane Legg ran through a wide-ranging set of material in his presentation “Machine Super Intelligence” to an audience of 50 people at the UKH+ meeting in Birkbeck College.

Slide 43 of 43 was the climax.  (The slides are available from Shane’s website, where you can also find links to YouTube videos of the event.)

It may be unfair of me to focus on the climax, but I believe it deserves a lot of attention.

Spoiler alert!

The climactic slide was entitled “A vision of the early 2020’s: the Halloween Scenario“.  It listed three assumptions about what will be the case by the early 2020’s, drew two conclusions, and then highlighted one big problem.

  1. First assumption – desktop computers with petaflop computing power will be widely available;
  2. Second assumption – AI researchers will have established powerful algorithms that explain and replicate deep belief networks;
  3. Brain reinforcement learning will be fairly well understood.

The first assumption is a fairly modest extrapolation of current trends in computing, and isn’t particularly contentious.

The second assumption was, in effect, the implication of around the first 30 slides of Shane’s talk, taking around 100 minutes of presentation time (interspersed with lots of audience Q&A, as typical at UKH+ meetings).  People can follow the references from Shane’s talk (and in other material on his website) to decide whether they agree.

For example (from slides 25-26), an implementation of a machine intelligence algorithm called MC-AIXI can already learn to solve or play:

  • simple prediction problems
  • Tic-Tac-Toe
  • Paper-Scissors-Rock (a good example of a non-deterministic game)
  • mazes where it can only see locally
  • various types of Tiger games
  • simple computer games, e.g. Pac-Man

and is now being taught to learn checkers (also known as draughts).  Chess will be the next step.  Note that this algorithm does not start off with the rules of best practice for these games built in (that is, it is not a specific AI program), but it can work out best practice for these games from its general intelligence.

The third assumption was the implication of the remaining 12 slides, in which Shane described (amongst other topics) work on something called “restricted Boltzmann machines“.

As stated in slide 38, on brain reinforcement learning (RL):

This area of research is currently progressing very quickly.

New genetically modified mice allow researchers to precisely turn on and off different parts of the brain’s RL system in order to identify the functional roles of the parts.

I’ve asked a number of researchers in this area:

  • “Will we have a good understanding of the RL system in the brain before 2020?”

Typical answer:

  • “Oh, we should understand it well before then. Indeed, we have a decent outline of the system already.”

Adding up these three assumptions, the first conclusion is:

  • Many research groups will be working on brain-like AGI architectures

The second conclusion is that, inevitably:

  • Some of these groups will demonstrate some promising results, and will be granted access to the super-computers of the time – which will, by then, be exaflop.

But of course, it’s when some almost human-level AGI algorithms, on petaflop computers, are let loose on exaflop supercomputers, that machine super intelligence might suddenly come into being – with results that might be completely unpredictable.

On the other hand, Shane observes that people who are working on the program of Friendly AI do not expect to have made significant progress in the same timescale:

  • By the early 2020’s, there will be no practical theory of Friendly AI.

Recall that the goal of Friendly AI is to devise a framework for AI research that will ensure that any resulting AIs have a very high level of safety for humanity no matter how super-intelligent they may become.  In this school of thought, after some time, all AI research would be constrained to adopt this framework, in order to avoid the risk of a catastrophic super-intelligence explosion.  However, at the end of Shane’s slides, the likelihood appears that the Friendly AI framework won’t be in place by the time we need it.

And that’s the Halloween nightmare scenario.

How should we respond to this scenario?

One response is to seek to somehow transfer the weight of AI research away from other forms of AGI (such as MC-AIXI) into Friendly AI?  This appears to be very hard, especially since research proceeds independently, in many different parts of the world.

A second response is to find reasons to believe that the Friendly AI project will have more time to succeed – in order words, reasons to believe that AGI will take longer to materialise than the date of the 2020’s mentioned above.  But given the progress that appears to be happening, that seems to me a reckless course of action.

Footnote: If anyone thinks they can make a good presentation on the topic of Friendly AI to a forthcoming UKH+ meeting, please get in touch!

Blog at WordPress.com.