In his presentation last week at the UKH+ meeting “The Friendly AI Problem: how can we ensure that superintelligent AI doesn’t terminate us?”, Roko Mijic referred to the plot of the classic 1956 science fiction film “Forbidden Planet”.
The film presents a mystery about events on a planet, Altair IV, situated 16 light years from Earth:
- What force had destroyed nearly every member of a previous spacecraft visiting that planet?
- And what force had caused the Krell – the original inhabitants of Altair IV – to be killed overnight, whilst at the peak of their technological powers?
A 1950s film might be expected to point a finger of blame at nuclear weapons, or other weapons of mass destruction. However, the problem turned out to be more subtle. The Krell had created a machine that magnified the power of their own thinking, and acted on that thinking. So the Krell all became even more intelligent and more effective than before. You may wonder, what’s the problem with that?
A 2002 Steven B. Harris article in Skeptic magazine, “The return of the Krell Machine: Nanotechnology, the Singularity, and the Empty Planet Syndrome”, takes up the explanation, quoting from the film. The Krell had created:
a big machine, 8000 cubic miles of klystron relays, enough power for a whole population of creative geniuses, operated by remote control – operated by the electromagnetic impulses of individual Krell brains… In return, that machine would instantaneously project solid matter to any point on the planet. In any shape or color they might imagine. For any purpose…! Creation by pure thought!
But … the Krell forgot one deadly danger – their own subconscious hate and lust for destruction!
And so, those mindless beasts of the subconscious had access to a machine that could never be shut down! The secret devil of every soul on the planet, all set free at once, to loot and maim! And take revenge… and kill!
Researchers at the Singularity Institute for Artificial Intelligence (SIAI) – including Roko – give a lot of thought to the general issue of unintended consequences of amplifying human intelligence. Here are two ways in which this amplification could go disastrously wrong:
- As in the Forbidden Planet scenario, this amplification could unexpectedly magnify feelings of ill-will and negativity – feelings which humans sometimes manage to suppress, but which can still exert strong influence from time to time;
- The amplification could magnify principles that generally work well in the usual context of human thought, but which can have bad consequences when taken to extremes.
As an example of the second kind, consider the general principle that a free market economy of individuals and companies who pursue an enlightened self-interest frequently produces goods that improve overall quality of life (in addition to generating income and profits). However, magnifying this principle is likely to result in occasional disastrous economic crashes. A system of computers that were programmed to maximise income and profits for their owners could, therefore, end up destroying the economy. (This example is taken from the book “Beyond AI: Creating the Conscience of the Machine” by J. Storrs Hall. See here for my comments on other ideas from that book.)
Another example of the second kind: a young, fast-rising leader within an organisation may be given more and more responsibility, on account of his or her brilliance, only for that brilliance to subsequently push the organisation towards failure if the general “corporate wisdom” is increasingly neglected. Likewise, there is the risk of a new supercomputer impressing human observers (politicians, scientists, and philosophers alike, amongst others) by the brilliance of its initial recommendations for changes in the structure of human society. But if operating safeguards are removed (or disabled – perhaps at the instigation of the supercomputer itself) we could find that the machine’s apparent brilliance results in disastrously bad decisions in unforeseen circumstances. (Hmm, I can imagine various writers calling for the “deregulation of the supercomputer”, in order to increase the income and profit it generates – similar to the way that many people nowadays are still resisting any regulation of the global financial system.)
That’s an argument for being very careful to avoid abdicating human responsibility for the oversight and operation of computers. Even if we think we have programmed these systems to observe and apply human values, we can’t be sure of the consequences when these systems gain more and more power.
However, as our computer systems increase their speed and sophistication, it’s likely to prove harder and harder for comparatively slow-brained humans to keep meaningfully cross-checking and monitoring the arguments raised by the computer systems in favour of specific actions. It’s akin to humans trying to teach apes calculus, in order to gain approval from apes for how much thrust to apply in a rocket missile system targeting a rapidly approaching earth-threatening meteorite. The computers may well decide that there’s no time to try to teach us humans the deeply complex theory that justifies whatever urgent decision they want to take.
And that’s a statement of the deep difficulty facing any “Friendly AI” program.
There are, roughly speaking, five possible ways people can react to this kind of argument.
The first response is denial – people say that there’s no way that computers will reach the level of general human intelligence within the foreseeable future. In other words, this whole discussion is seen as being a fantasy. However, it comes down to a question of probability. Suppose you’re told that there’s a 10% chance that the airplane you’re about to board will explode high in the sky, with you in it. 10% isn’t a high probability, but since the outcome is so drastic, you would probably decide this is a risk you need to avoid. Even if there’s only a 1% chance of the emergence of computers with human-level intelligence in (say) the next 20 years, it’s something that deserves serious further analysis.
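The probability argument above can be made concrete with a little arithmetic. Here is a minimal sketch, assuming that risk is weighed as probability multiplied by severity (an "expected loss"). The function name and the severity figures are hypothetical, chosen only to illustrate the shape of the argument; they are not estimates of any real risk.

```python
def expected_loss(probability: float, severity: float) -> float:
    """Probability-weighted cost of an outcome (severity in arbitrary units)."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return probability * severity


# Hypothetical severities, in arbitrary units (assumptions, not data).
MINOR_INCONVENIENCE = 1.0
CATASTROPHE = 1_000_000.0

# A likely but trivial problem versus an unlikely but drastic one.
likely_but_minor = expected_loss(0.5, MINOR_INCONVENIENCE)
unlikely_but_drastic = expected_loss(0.01, CATASTROPHE)

print(f"likely but minor:     {likely_but_minor}")
print(f"unlikely but drastic: {unlikely_but_drastic}")
```

On these made-up numbers, the 1% risk still dwarfs the 50% one, which is the point of the airplane comparison: a small probability does not make a drastic outcome safe to ignore.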
The second response is to seek to stop all research into AI, by appeal to a general “precautionary principle” or similar. This response is driven by fear. However, any such ban would need to apply worldwide, and would surely be difficult to police. It’s too hard to draw the boundary between “safe computer science” and “potentially unsafe computer science” (the latter being research that could increase the probability of the emergence of computers with human-level intelligence).
The third response is to try harder to design the right “human values” into advanced computer systems. However, as Roko argued in his presentation, there is enormous scope for debating what these right values are. After all, society has been arguing over human values since the beginning of recorded history. Existing moral codes probably all have greater or lesser degrees of internal tension or contradiction. In this context, the idea of “Coherent Extrapolated Volition” has been proposed:
Our coherent extrapolated volition is our choices and the actions we would collectively take if we knew more, thought faster, were more the people we wished we were, and had grown up closer together.
Eliezer Yudkowsky believes a Friendly AI should initially seek to determine the coherent extrapolated volition of humanity, and then alter its goals accordingly. Many other researchers believe, however, that the collective will of humanity will not converge to a single coherent set of goals even if “we knew more, thought faster, were more the people we wished we were, and had grown up closer together.”
A fourth response is to adopt emulation rather than design as the key principle for obtaining computers with human-level intelligence. This involves the idea of “whole brain emulation” (WBE), with a low-level copy of a human brain. The idea is sometimes also called “uploads” since the consciousness of the human brain may end up being uploaded onto the silicon emulation.
Oxford philosopher Anders Sandberg reports on his blog how a group of Singularity researchers reached a joint conclusion, at a workshop in October following the Singularity Summit, that WBE was a safer route to follow than designing AGI (Artificial General Intelligence):
During the workshop afterwards we discussed a wide range of topics. Some of the major issues were: what are the limiting factors of intelligence explosions? What are the factual grounds for disagreeing about whether the singularity may be local (self-improving AI program in a cellar) or global (self-improving global economy)? Will uploads or AGI come first? Can we do anything to influence this?
One surprising discovery was that we largely agreed that a singularity due to emulated people… has a better chance given current knowledge than AGI of being human-friendly. After all, it is based on emulated humans and is likely to be a broad institutional and economic transition. So until we think we have a perfect friendliness theory we should support WBE – because we could not reach any useful consensus on whether AGI or WBE would come first. WBE has a somewhat measurable timescale, while AGI might crop up at any time. There are feedbacks between them, making it likely that if both happens it will be closely together, but no drivers seem to be strong enough to really push one further into the future. This means that we ought to push for WBE, but work hard on friendly AGI just in case…
However, it seems to me that the above “Forbidden Planet” argument identifies a worry with this kind of approach. Even an apparently mild and deeply humane person might be playing host to “secret devils” – “their own subconscious hate and lust for destruction”. Once the emulated brain starts running on more powerful hardware, goodness knows what these “secret devils” might do.
In view of the drawbacks of each of these four responses, I end by suggesting a fifth. Rather than pursuing an artificial intelligence which would run separately from a human intelligence, we should explore the creation of hybrid intelligence. Such a system involves making humans smarter at the same time as the computer systems become smarter. The primary source for this increased human smartness is closer links with the ever-improving computer systems.
In other words, rather than just talking about AI – Artificial Intelligence – we should be pursuing IA – Intelligence Augmentation.
For a fascinating hint about the benefits of hybrid AI, consider the following extract from a recent article by former world chess champion Garry Kasparov:
In chess, as in so many things, what computers are good at is where humans are weak, and vice versa. This gave me an idea for an experiment. What if instead of human versus machine we played as partners? My brainchild saw the light of day in a match in 1998 in León, Spain, and we called it “Advanced Chess.” Each player had a PC at hand running the chess software of his choice during the game. The idea was to create the highest level of chess ever played, a synthesis of the best of man and machine.
Although I had prepared for the unusual format, my match against the Bulgarian Veselin Topalov, until recently the world’s number one ranked player, was full of strange sensations. Having a computer program available during play was as disturbing as it was exciting. And being able to access a database of a few million games meant that we didn’t have to strain our memories nearly as much in the opening, whose possibilities have been thoroughly catalogued over the years. But since we both had equal access to the same database, the advantage still came down to creating a new idea at some point…
Even more notable was how the advanced chess experiment continued. In 2005, the online chess-playing site Playchess.com hosted what it called a “freestyle” chess tournament in which anyone could compete in teams with other players or computers. Normally, “anti-cheating” algorithms are employed by online sites to prevent, or at least discourage, players from cheating with computer assistance. (I wonder if these detection algorithms, which employ diagnostic analysis of moves and calculate probabilities, are any less “intelligent” than the playing programs they detect.)
Lured by the substantial prize money, several groups of strong grandmasters working with several computers at the same time entered the competition. At first, the results seemed predictable. The teams of human plus machine dominated even the strongest computers. The chess machine Hydra, which is a chess-specific supercomputer like Deep Blue, was no match for a strong human player using a relatively weak laptop. Human strategic guidance combined with the tactical acuity of a computer was overwhelming.
The surprise came at the conclusion of the event. The winner was revealed to be not a grandmaster with a state-of-the-art PC but a pair of amateur American chess players using three computers at the same time. Their skill at manipulating and “coaching” their computers to look very deeply into positions effectively counteracted the superior chess understanding of their grandmaster opponents and the greater computational power of other participants. Weak human + machine + better process was superior to a strong computer alone and, more remarkably, superior to a strong human + machine + inferior process.
The terminology “Hybrid Intelligence” was used in a recent presentation at the University of Washington by Google’s VP of Research & Special Initiatives, Alfred Z. Spector. My thanks to John Pagonis for sending me a link to a blog post by Greg Linden which in turn provided commentary on Al Spector’s talk:
What was unusual about Al’s talk was his focus on cooperation between computers and humans to allow both to solve harder problems than they might be able to otherwise.
Starting at 8:30 in the talk, Al describes this as a “virtuous cycle” of improvement using people’s interactions with an application, allowing optimizations and features like learning to rank, personalization, and recommendations that might not be possible otherwise.
Later, around 33:20, he elaborates, saying we need “hybrid, not artificial, intelligence.” Al explains, “It sure seems a lot easier … when computers aren’t trying to replace people but to help us in what we do. Seems like an easier problem …. [to] extend the capabilities of people.”
Al goes on to say the most progress on very challenging problems (e.g. image recognition, voice-to-text, personalized education) will come from combining several independent, massive data sets with a feedback loop from people interacting with the system. It is an “increasingly fluid partnership between people and computation” that will help both solve problems neither could solve on their own.
I’ve got more to say about Al Spector’s talk – but I’ll save that for another day.