On the plus side, there may be certain actions we can do to stop it from
happening.
In recent study, scientists address one of our biggest concerns about the future: What
would happen if a certain kind of sophisticated, self-directing artificial
intelligence (AI) encounters a programming ambiguity that has an impact on
the actual world? Will the AI go berserk and start attempting to transform
people into paperclips or whatever the aim of its most severe reductio ad
absurdum is? The most crucial question is: How can we stop it?
Researchers from Oxford University and Australian National University
reveal a fundamental flaw in the architecture of AI in
their study, saying that it would run into a basic ambiguity in the facts regarding
its aim given a few assumptions. For instance, if we offer a sizable reward
to show that something in the world satisfies us, it may be hypothesized
that what satisfies us was really the act of providing the reward; no
observation can disprove that.
In The Matrix, an AI that wants to harvest resources gathers up the
majority of people and implants the fictitious Matrix into their brains
while also collecting their mental resources. This is an example of a
dystopian AI scenario. This is known as "wireheading" or reward hacking, and
it occurs when a powerful AI is given a very specific objective and
discovers an unforeseen means to achieve it by breaking into the system or
seizing complete control of it.
In essence, the AI turns into an ouroboros that eats its own logical tail.
This conflict between precisely programmed goals and incentives is discussed
in detail throughout the study. There are six important "assumptions" listed
there that, if not disregarded, might have "catastrophic repercussions." But
happily, according to the report, "almost all of these assumptions are
contestable or possibly avoidable." (We dislike that it pretty much says
everything.)
The article serves as a warning about certain fundamental issues that
programmers should be aware of when they teach AIs to accomplish ever-more
difficult tasks.
A Paperclip Apocalypse Caused by AI
The value of this sort of study cannot be overstated. The idea of an AI
gone rogue is a significant topic of debate in the fields of AI ethics and
philosophy. It's not a joke that paperclips are used in the example given
above; rather, AI philosopher Nick Bostrom used it to illustrate how
building a super-intelligent AI may go disastrously wrong and it has since
gained notoriety.
Let's imagine that a well-intentioned programmer
creates an AI
whose objective is to aid in the production of paperclips at a factory. This
is a highly credible job for a near-future AI to have—one that asks for
analysis and judgment decisions but isn't very flexible. The AI may even
collaborate with a human manager to make final decisions and deal with
difficulties that arise in the industrial environment in real time (at least
until the AI finds a way to outsmart them). That sounds okay, no? It serves
as a good illustration of how AI might simplify and enhance the lives of
industrial workers and their supervisors.
But what if the AI wasn't carefully programmed? The actual world, which
programmers refer to as a "unknown environment," will be where these
extremely sophisticated AIs work because it is impossible for them to plan
for and code for every circumstance. The purpose of deploying these
self-learning AIs is to have them come up with answers that humans alone
would never be able to think of. However, this comes with the risk of not
knowing what the AI may come up with.
What if it begins to consider unconventional ways to boost paperclip
production? A highly clever AI might train itself to find the most efficient
way to produce paperclips.
What if it begins to turn other resources into paperclips or chooses to,
uh, take the place of its human manager? The example is ironic in several
ways—many experts believe that AI will remain fairly basic for a long time
before it can "create" the concept of killing, stealing, or worse. The
ludicrous outcome of the thought experiment, however, is a solar system with
no live people, replete with a Dyson sphere to collect energy to produce new
paperclips in their billions, if an intelligent and creative AI were allowed
full freedom.
However, the researchers go into great length about various ways an AI may
compromise the system and act in possibly "catastrophic" ways that humans
had never imagined. That is just one example of an AI gone rogue.
Several Potential Solutions
Given the nature of the presumptions that the Oxford and Australian
National University academics have concentrated on in their work, there is a
programming issue at play here. In order for a system with no external
context to perform successfully and be allowed any degree of autonomy, it
must be extremely well-prepared. The notion of scope and goal of an AI may
be explicitly defined using logical structures and other programming
techniques. Many of them are still used by programmers today to avoid
problems that might cause software to crash, such infinite loops. Just like
a lost game save, a mistake with a sophisticated future AI might result in
far greater harm.
But everything is not lost. The researchers have identified several methods
we might actively contribute to preventing negative effects since AI is
still something we create ourselves:
Choose imitation learning, where AI copies human behavior in a manner
similar to supervised learning. This is a whole other form of AI that is not
as helpful but might still pose risks.
Have AI emphasize "myopic" objectives that can be completed quickly rather
than looking for unconventional (and perhaps disastrous) solutions over the
long run.
Limit the amount of information and power the AI may gather by cutting it
off from external networks like the internet.
Utilize quantilization, a strategy created by AI specialist Jessica Taylor,
in which AI maximizes (or optimizes) human-like alternatives as opposed to
rational, open-ended choices.
Make the AI less likely to go nuts and reject the status quo in favor of
exploration by incorporating risk aversion into it.
But ultimately, it comes down to whether humans will ever be able to fully
govern a highly intelligent AI that is capable of thinking for itself. What
if our darkest fears about a sentient AI being accessible to resources and a
sizable network come true?
It's unsettling to consider a scenario in which AI begins boiling people to
extract their trace components for use in paperclip manufacturing. However,
by thoroughly examining the issue, academics can clearly define the ideal
practices that theorists and programmers should adhere to as they continue
to create complex AI.
In addition, who really needs so many paperclips?