Microsoft’s competition requires AI agents to cooperate in a virtual universe
If artificial intelligence (AI) agents are to become real players in society, using their machine abilities to complement our human strengths, they must first become players in the video game of Minecraft. And to prove themselves in Minecraft, they must work together to capture animals in a maze, build towers of blocks, and hunt for treasure while fighting off skeletons.
That, anyway, is the premise of a competition organized by Microsoft, Queen Mary University of London, and crowdAI (a platform for data-science challenges). Next month, the organizers will announce the winner—the team that created an AI that could best observe its Minecraft environment, determine which of three missions it had to accomplish, and then collaborate with another AI agent to carry out that mission.
By emphasizing adaptability and cooperation, the organizers aimed to encourage research on AI agents that could one day interact with humans to accomplish tasks in the real world. And while an AI that can truly match the intellectual capacity of a human is still the stuff of science fiction, researchers could take meaningful steps toward that goal of artificial general intelligence (AGI) in Minecraft.
The Multi-Agent Reinforcement Learning in MalmO (MARLO) competition is an offshoot of Project Malmo, which AI researcher Katja Hofmann began in 2015 at Microsoft Research Cambridge, in England. Although much exciting AI research has involved AIs mastering strategy games like chess and Go, Hofmann was looking for a game that would allow an AI to learn a broader range of skills.
“The moment we started talking about Minecraft, it was obvious that this was a perfect environment for AI research,” she says. “It’s a world that people join with no predefined goal.” Project Malmo is a platform built on top of Minecraft in which researchers can perform many different kinds of AI experiments, yet also compare their results in a standardized way.
In the inaugural MARLO challenge, in 2017, AI agents were asked to carry out a single mission: catching a pig. For the 2018 competition, the organizers made it harder by designing three different missions that all require collaboration. The AI competitors had to learn how to identify another AI agent in the environment, and then find a way to work together toward their common goal.
An AI agent that could hypothesize about the goals of another agent would have a rudimentary form of what psychologists call “theory of mind,” the human capacity to attribute mental states and intentions to other people. Hofmann hopes that AI agents will eventually hone this ability by collaborating with human players in Minecraft. “Then the algorithms could learn to collaborate with humans,” she says, “and learn what humans want.”
The AIs at play in the MARLO competition were trained via reinforcement learning, in which an AI learns through an intense course of trial and error. Each team’s AI began by making random movements and observing their effects on the game. The competition environment had rewards built into the game, so the AI received points for certain achievements. Eventually, the AI figured out the actions that caused it to acquire points—and that resulted in a captured chicken or found treasure. While the reinforcement-learning algorithms did most of the work in these training sessions, each MARLO team had its own strategy to speed up or improve the learning.
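To make the trial-and-error loop concrete, here is a minimal sketch of reinforcement learning in a toy setting. It is not any MARLO team's code and does not use the Project Malmo API. The corridor world, the reward values, and the function names (`step`, `train`, `greedy_path`) are all assumptions made for illustration, and tabular Q-learning stands in as one standard reinforcement-learning algorithm among many. The agent starts out choosing largely arbitrary moves, earns a point when it reaches the "treasure" in the last cell, and gradually learns which actions lead there.

```python
import random

# Hypothetical toy world: a corridor of cells 0..N_CELLS-1,
# with the "treasure" sitting in the last cell.
N_CELLS = 6
ACTIONS = (-1, +1)  # index 0 = step left, index 1 = step right


def step(state, action):
    """Apply a move and return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_CELLS - 1)
    done = next_state == N_CELLS - 1
    # Assumed reward scheme: +1 for the treasure, a small penalty per move.
    reward = 1.0 if done else -0.01
    return next_state, reward, done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn a table of action values by trial and error (Q-learning)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_CELLS)]  # q[state][action_index]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap the length of each episode
            if rng.random() < epsilon:
                a = rng.randrange(2)  # explore: pick a random move
            else:
                a = max(range(2), key=lambda i: q[state][i])  # exploit
            next_state, reward, done = step(state, ACTIONS[a])
            # Nudge the estimate toward reward plus discounted future value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
            if done:
                break
    return q


def greedy_path(q):
    """Follow the highest-valued action from cell 0 and record the cells visited."""
    state, path = 0, [0]
    for _ in range(2 * N_CELLS):
        a = max(range(2), key=lambda i: q[state][i])
        state, _, done = step(state, ACTIONS[a])
        path.append(state)
        if done:
            break
    return path


if __name__ == "__main__":
    print(greedy_path(train()))
```

After training with these assumed settings, the learned policy should walk straight down the corridor to the treasure. The agents in the competition faced a far richer environment and a second agent to coordinate with, but the underlying loop of acting, observing a reward, and updating is the same idea.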
Minecraft is just one of many complex video games now being used by AI researchers, says Igor Mordatch, who leads multiagent research at OpenAI, a nonprofit research organization based in San Francisco. OpenAI hasn’t focused on Minecraft, instead creating AI agents that can play the multiplayer video game Dota 2.
“We’re building toward a good ecosystem of environments and benchmarks for reinforcement learning,” says Mordatch. “But there is a challenge now: How do we ensure that the AI is learning something useful in these games?”
Hofmann says the AI agents that succeed in Minecraft can likely succeed in other video games as well. It’s easy to imagine that AI could power the nonplayer characters in games, perhaps giving these characters the ability to interact naturally and cooperate with human players.
One competitor in the MARLO challenge sees even more practical applications for his entry. Donghun Lee, a researcher at the Electronics and Telecommunications Research Institute, in South Korea, focused on getting his AI agent to communicate effectively and express its intentions. He says this ability will feed directly into his research on the Internet of Things (IoT). The proliferation of smart devices poses communication problems, Lee says, as many networked devices now need to work in tandem.
“It’s not an easy job to manipulate all these devices at once in a cloud,” he says. But with multiagent reinforcement learning, he says, the IoT devices could figure out how to work together.
This article appears in the January 2019 print issue as “Why AIs Are Chasing Chickens in Minecraft.”