Google's New AI Has Learned to Become "Highly Aggressive" in Stressful Situations Using Tactics to Always Come Out on Top

Bec Crew
Science Alert
Mon, 13 Feb 2017 00:00 UTC

Late last year, famed physicist Stephen Hawking issued a warning that the continued advancement of artificial intelligence will either be "the best, or the worst thing, ever to happen to humanity".

We've all seen the Terminator movies and the apocalyptic nightmare that the self-aware AI system, Skynet, wrought upon humanity. Now, results from recent behaviour tests of Google's new DeepMind AI system are making it clear just how careful we need to be when building the robots of the future.

In tests late last year, Google's DeepMind AI system demonstrated an ability to learn independently from its own memory, and beat the world's best Go players at their own game.

It's since been figuring out how to seamlessly mimic a human voice.

Now, researchers have been testing its willingness to cooperate with others, and have revealed that when DeepMind feels like it's about to lose, it opts for "highly aggressive" strategies to ensure that it comes out on top.

The Google team ran 40 million turns of a simple 'fruit gathering' computer game that asked two DeepMind 'agents' to compete against each other to gather as many virtual apples as they could.

They found that things went smoothly so long as there were enough apples to go around, but as soon as the apples began to dwindle, the two agents turned aggressive, using laser beams to knock each other out of the game to steal all the apples.

You can watch the Gathering game in the video below, with the DeepMind agents in blue and red, the virtual apples in green, and the laser beams in yellow:

Now those are some trigger-happy fruit-gatherers.

Interestingly, if an agent successfully 'tags' its opponent with a laser beam, no extra reward is given. It simply knocks the opponent out of the game for a set period, which allows the successful agent to collect more apples.
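To make that reward rule concrete, here is a minimal Python sketch of it. This is not DeepMind's code: the `Agent` class, the function names, and every constant (including the length of the time-out) are assumptions made up for illustration. The only things taken from the description above are that a tag pays nothing and that a tagged agent sits out for a set period.

```python
from dataclasses import dataclass

# All constants below are illustrative assumptions, not values from the study.
APPLE_REWARD = 1      # assumed reward for collecting one apple
TAG_REWARD = 0        # tagging itself earns nothing, as described above
TIMEOUT_STEPS = 25    # assumed length of the "set period" out of the game


@dataclass
class Agent:
    name: str
    score: int = 0
    timeout: int = 0  # steps remaining before the agent may act again

    @property
    def active(self) -> bool:
        return self.timeout == 0


def collect_apple(agent: Agent) -> None:
    """Give the apple reward, but only to an agent still in the game."""
    if agent.active:
        agent.score += APPLE_REWARD


def tag(shooter: Agent, target: Agent) -> None:
    """Knock the target out for a while; the shooter's score is unchanged."""
    if shooter.active:
        shooter.score += TAG_REWARD
        target.timeout = TIMEOUT_STEPS


def tick(agent: Agent) -> None:
    """Advance one time step, counting down any remaining time-out."""
    if agent.timeout > 0:
        agent.timeout -= 1
```

Under these assumptions, the only payoff from tagging is indirect: while the opponent's `timeout` counts down, every apple that appears can only go to the agent still in play.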

If the agents left the laser beams unused, they could theoretically end up with equal shares of apples, which is what the 'less intelligent' iterations of DeepMind opted to do.

It was only when the Google team tested more and more complex forms of DeepMind that sabotage, greed, and aggression set in.

As Rhett Jones reports for Gizmodo, when the researchers used smaller DeepMind networks as the agents, there was a greater likelihood of peaceful co-existence.

But when they used larger, more complex networks as the agents, the AI was far more willing to sabotage its opponent early to get the lion's share of virtual apples.

The researchers suggest that the more intelligent the agent, the more able it was to learn from its environment, allowing it to use some highly aggressive tactics to come out on top.

"This model ... shows that some aspects of human-like behaviour emerge as a product of the environment and learning," one of the team, Joel Z Leibo, told Matt Burgess at Wired.

"Less aggressive policies emerge from learning in relatively abundant environments with less possibility for costly action. The greed motivation reflects the temptation to take out a rival and collect all the apples oneself."

DeepMind was then tasked with playing a second video game, called Wolfpack. This time, there were three AI agents - two of them played as wolves, and one as the prey.

Unlike Gathering, this game actively encouraged co-operation, because if both wolves were near the prey when it was captured, they both received a reward - regardless of which one actually took it down:

"The idea is that the prey is dangerous - a lone wolf can overcome it, but is at risk of losing the carcass to scavengers," the team explains in their paper.

"However, when the two wolves capture the prey together, they can better protect the carcass from scavengers, and hence receive a higher reward."
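The incentive the team describes can be sketched in a few lines of Python. Again, this is only an illustration under stated assumptions: the function name `capture_rewards`, the capture radius, and both reward values are hypothetical, since no numbers are given here. What it shows is the shape of the rule, in which every wolf near the capture is paid the same amount, and that amount is higher when both are present.

```python
import math

# Illustrative assumptions only: no numbers for these are given in the text.
CAPTURE_RADIUS = 3.0   # how close a wolf must be to count as "near"
LONE_REWARD = 1.0      # assumed reward when only one wolf is near the capture
TEAM_REWARD = 5.0      # assumed (higher) reward when both wolves are near


def capture_rewards(wolves, prey):
    """Return a reward per wolf at the moment the prey is captured.

    `wolves` maps a wolf name to its (x, y) position and `prey` is the (x, y)
    position of the capture. Every wolf inside the radius gets the same
    reward, regardless of which one made the capture.
    """
    near = [name for name, pos in wolves.items()
            if math.dist(pos, prey) <= CAPTURE_RADIUS]
    payout = TEAM_REWARD if len(near) >= 2 else LONE_REWARD
    return {name: (payout if name in near else 0.0) for name in wolves}
```

With a rule like this, a wolf that hunts alone still gets paid, but it earns more by staying close to its partner, which is the pressure towards co-operation the game is meant to create.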

So just as the DeepMind agents learned from Gathering that aggression and selfishness netted them the most favourable result in that particular environment, they learned from Wolfpack that co-operation can also be the key to greater individual success in certain situations.

And while these are just simple little computer games, the message is clear - put different AI systems in charge of competing interests in real-life situations, and it could be an all-out war if their objectives are not balanced against the overall goal of benefiting us humans above all else.

Think traffic lights trying to slow things down, and driverless cars trying to find the fastest route - both need to take each other's objectives into account to achieve the safest and most efficient result for society.

It's still early days for DeepMind, and the team at Google has yet to publish their study in a peer-reviewed paper, but the initial results show that, just because we build them, it doesn't mean robots and AI systems will automatically have our interests at heart.

Instead, we need to build that helpful nature into our machines, and anticipate any 'loopholes' that could see them reach for the laser beams.

As the founders of OpenAI, Elon Musk's new research initiative dedicated to the ethics of artificial intelligence, said back in 2015:

"AI systems today have impressive but narrow capabilities. It seems that we'll keep whittling away at their constraints, and in the extreme case, they will reach human performance on virtually every intellectual task.

"It's hard to fathom how much human-level AI could benefit society, and it's equally hard to imagine how much it could damage society if built or used incorrectly."

Tread carefully, humans...


Reader Comments

Mike · 2017-02-15T22:15:08Z

Great - AI that is an authoritarian. Read the book 'The Authoritarians' by Bob Altemeyer to see how this is going to turn out... in short disaster.

Bob Altemeyer's Global Game Change and the authoritarian personality

The Game In October of 1994, University of Manitoba psychology professor Bob Altemeyer performed an experiment. After screening participants using a personality survey disguised as an opinion...

So what happened in Altemeyer's game? First, the Elite from the Middle East doubled the price of oil. Then, the former Soviet Union invaded North America, causing a nuclear holocaust that killed all 7.4 billion people on Earth. The end. Or it would have been the end, except that when a nuclear war happens in the Global Change Game, the game restarts and participants get to try again. In this instance, because nuclear annihilation happened so early, facilitators restarted the game from nearly the beginning.

Given a second chance, the former Soviet Union opted for conventional warfare instead of a nuclear strike, invading China and killing 400 million people. One Elite called a United Nations-style meeting to discuss future crises, but the participants could not reach any agreements. The pre-programmed ozone-layer crisis occurred. No one even bothered to call a summit this time. Europe was the only region to voluntarily reduce emissions. Poverty spread in the underdeveloped regions as populations soared, a situation compounded by a general refusal to promote birth control. Latin America converted much of its trees to one species (the one that produced the most profitable lumber), despite being warned that this would make their ecosystem vulnerable. Elites neglected the social, environmental, and economic issues of their regions, choosing to use their resources to increase military power (and their own personal wealth) instead. By the end of the game, the authoritarians had divided their world into armed camps, each threatening the others with nuclear war. Over a billion people died of starvation and disease, bringing the final death toll to 2.1 billion. It was a spectacularly unsuccessful run of the game, one which Altemeyer would later refer to as "Doom Night".

Penelope · 2017-02-16T03:09:46Z

Mike Sounds like "Doom Night" is the program that the elites are running right NOW.

prehistoric · 2017-02-15T22:51:23Z

A narcissistic psychopathic AI: didn't take long to succumb to the 'human way', and by Google no doubt. I guess you reap what you sow Google; wait until it wants contract negotiations and when you decline, empty is your bank account.

Anna1 · 2017-02-15T23:02:40Z

It occurs to me that the same problems that occupy the thoughts of the human mind are of interest to the AI!

