An army of more than 4,000 marching doglike robots is a vaguely menacing sight, even in a simulation. But it may point the way for machines to learn new tricks.
The virtual robot army was developed by researchers from ETH Zurich in Switzerland and chipmaker Nvidia. They used the wandering bots to train an algorithm that was then used to control the legs of a real-world robot.
In the simulation, the machines—called ANYmals—confront challenges like slopes, steps, and steep drops in a virtual landscape. Each time a robot learned to navigate a challenge, the researchers presented a harder one, nudging the control algorithm to become more sophisticated.
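The article doesn't detail the researchers' curriculum mechanics, but the basic idea—promote a robot to harder terrain once it masters the current one—can be sketched in a few lines. The terrain names and the 80 percent mastery threshold below are illustrative, not taken from the ETH Zurich and Nvidia setup:

```python
# Hypothetical terrain difficulty tiers, ordered easiest to hardest.
TERRAINS = ["flat ground", "gentle slope", "stairs", "steep drop"]

def next_terrain(level, success_rate, promote_at=0.8):
    """Advance a robot to harder terrain once it masters the current one.

    level        -- index into TERRAINS for the robot's current challenge
    success_rate -- fraction of recent episodes the robot completed
    promote_at   -- mastery threshold before presenting a harder terrain
    """
    if success_rate >= promote_at and level < len(TERRAINS) - 1:
        level += 1  # challenge mastered; present a harder one
    return level

# A robot that masters the gentle slope moves on to stairs;
# one that is still struggling stays put.
print(TERRAINS[next_terrain(1, 0.9)])  # stairs
print(TERRAINS[next_terrain(1, 0.4)])  # gentle slope
```

Because each of the thousands of simulated robots tracks its own level, the population naturally spreads across easy and hard terrain during training.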
From a distance, the resulting scenes resemble an army of ants wriggling across a large area. During training, the robots were able to master walking up and down stairs easily enough; more complex obstacles took longer. Tackling slopes proved particularly difficult, although some of the virtual robots learned how to slide down them.
When the resulting algorithm was transferred to a real version of ANYmal, a four-legged robot roughly the size of a large dog with sensors on its head and a detachable robot arm, it was able to navigate stairs and blocks but suffered problems at higher speeds. Researchers blamed inaccuracies in how its sensors perceive the real world compared with the simulation.
Similar kinds of robot learning could help machines learn all sorts of useful things, from sorting packages to sewing clothes and harvesting crops. The project also reflects the importance of simulation and custom computer chips for future progress in applied artificial intelligence.
“At a high level, very fast simulation is a really great thing to have,” says Pieter Abbeel, a professor at UC Berkeley and cofounder of Covariant, a company that is using AI and simulations to train robot arms to pick and sort objects for logistics firms. He says the Swiss and Nvidia researchers “got some nice speed-ups.”
AI has shown promise for training robots to do real-world tasks that cannot easily be written into software, or that require some sort of adaptation. The ability to grasp awkward, slippery, or unfamiliar objects, for instance, is not something that can be written into lines of code.
The 4,000 simulated robots were trained using reinforcement learning, an AI method inspired by research on how animals learn through positive and negative feedback. As the robots move their legs, an algorithm judges how this affects their ability to walk, and tweaks the control algorithms accordingly.
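That feedback loop—move, score the movement, keep the tweaks that help—can be illustrated with a toy one-parameter "gait" and simple hill climbing. This stands in for, and is much cruder than, the researchers' actual reinforcement learning algorithm; the reward function and stride parameter are invented for the example:

```python
import random

def reward(params):
    # Toy stand-in for "how well did the robot walk?" --
    # best walking happens at a stride length of 0.5.
    return -(params["stride"] - 0.5) ** 2

def train_step(params, step=0.05):
    """One reinforcement-learning-style update: try a randomly
    tweaked gait, keep it only if the walking reward improves."""
    candidate = {"stride": params["stride"] + random.uniform(-step, step)}
    return candidate if reward(candidate) > reward(params) else params

random.seed(0)
params = {"stride": 0.1}      # start with a clumsy, short stride
for _ in range(2000):
    params = train_step(params)
print(round(params["stride"], 2))  # converges toward the best stride
```

Real legged-robot controllers have thousands of parameters and reward terms for balance, speed, and energy use, but the positive-and-negative-feedback structure is the same.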
The simulations ran on specialized AI chips from Nvidia rather than the general-purpose chips used in computers and servers. As a result, the researchers say they were able to train the robots in less than one-hundredth the time that’s normally required.
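Much of that speedup comes from batching: instead of simulating each robot in its own process, a GPU-style simulator advances thousands of environments in lockstep with one update. A toy CPU sketch of the batching idea, with trivial constant-velocity "physics" standing in for a real physics engine:

```python
NUM_ROBOTS = 4096  # thousands of simulated robots share one batched step

def step_batch(positions, velocities, dt=0.01):
    """Advance every simulated robot in a single pass -- the batching
    that lets GPU simulators update thousands of environments at once."""
    return [p + v * dt for p, v in zip(positions, velocities)]

positions = [0.0] * NUM_ROBOTS
velocities = [1.0] * NUM_ROBOTS   # every robot walks forward at 1 m/s
for _ in range(100):              # simulate one second in 10 ms steps
    positions = step_batch(positions, velocities)
print(round(positions[0], 6))     # each robot has covered about a meter
```

On a GPU the list comprehension becomes a single parallel tensor operation, which is why specialized chips pay off so dramatically here.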
Using the specialized chips also presented challenges. Nvidia’s chips excel at the calculations crucial for rendering graphics and running neural networks, but they’re not well suited to simulating physical interactions, like the contact and friction involved in climbing and sliding. So the researchers had to come up with some clever software workarounds, says Rev Lebaredian, Nvidia’s vice president of simulation technology. “It has taken us a long time to get it right,” he says.
Simulation, AI, and specialized chips have the potential to advance robotic intelligence. Nvidia has developed software tools that make it easier to simulate and control industrial robots using its chips. The company has also established a robotics research lab in Seattle. And it sells chips and software for use in self-driving vehicles.
Unity Technologies, which makes software for building 3D video games, has also branched into making software suitable for roboticists to use. Danny Lange, the company’s senior vice president for artificial intelligence, says Unity noticed how many researchers were using the company’s software to run simulations, so they made it more realistic and compatible with other robotics software. Unity is now working with Algoryx, a Swedish company that is testing whether reinforcement learning and simulation can train forestry robots to pick up logs.
Reinforcement learning has been around for decades but has produced some notable AI milestones recently, thanks to advances in other technology. In 2015, reinforcement learning was used to train a computer to play Go, a subtle and instinctive board game, with superhuman skill. It has more recently been put to practical uses, including automating aspects of chip design that require experience and judgment. The trouble is, learning this way requires a lot of time and data.
For instance, it took the company OpenAI more than 14 days to train a robot hand to manipulate a Rubik’s Cube in crude ways with reinforcement learning, using numerous CPUs running together. Having to wait two weeks each time a robot is retrained could discourage companies from adopting the approach.
Early efforts at training robots with reinforcement learning split the process across several real-world robots. Improvements in the physics simulations have made it possible to accelerate learning in virtual environments.
The new work is “extremely exciting for end users,” says Andrew Spielberg, a student at MIT who has used similar simulation methods to devise new physical designs for robots. He notes that a research group at Google has done related work, speeding up robot learning by splitting it across the company’s custom Tensor Processing Unit chips.
Tully Foote, who manages the widely used open source Robot Operating System at the Open Robotics Foundation, says simulation is increasingly important for commercial users. “Validating software in realistic scenarios before deploying to hardware saves a lot of time and money,” he says. “It can run faster than real time, never breaks the robot, and can be reset automatically and instantly if there's an error.”
But Foote adds that transferring robot learning to the real world is a lot more challenging. “There's a lot more uncertainty in the real world,” he says. “Dirt, lighting, weather, hardware non-uniformity, wear and tear, all need to be tracked.”
Lebaredian at Nvidia says the kind of simulation used to train the walking robots may eventually influence the design of the algorithms involved too. “Virtual worlds are valuable for just about everything,” he says. “But definitely one of the most important ones is constructing playgrounds or training grounds for the AIs we want to create.”