AIs are starting to learn like human babies by grasping and poking objects

“Who knew collecting data for an AI would be so rewarding?”
Image: Reuters/Jason Lee

Researchers at Carnegie Mellon University are teaching robots to learn by touch, much like human babies. The experiment could one day allow artificial intelligence to learn about the physical environment through its senses, including touch. The development draws robotics and AI closer together, paving the way for potential unified applications in factories, automated deliveries of goods, or household assistants.

The research is set out in a paper by CMU students Lerrel Pinto, Dhiraj Gandhi, and Yuanfeng Han, and professors Yong-Lae Park and Abhinav Gupta. The paper, titled “The Curious Robot: Learning Visual Representations via Physical Interactions,” describes the experiment’s goal: to use physical robotic interactions to teach an AI to recognize objects.

Poking and pushing

The research is a departure from established AI practices, which often use an algorithm to instruct a robot to perform certain physical tasks. “Babies push objects, poke them, put them in their mouth and throw them to learn representations. Towards this goal, we build one of the first systems on a [robotic arm] that pushes, pokes, grasps and actively observes objects in a tabletop environment,” the authors write.

Instead of using so-called “passive observation” that relies on feeding AI lots of data, the CMU experiment wanted to use a robot (a Baxter robot, similar to the one pictured above) to actively explore its environment in order to collect data, which would then be used to train an AI. “While there has been significant work in the vision and robotics community to develop vision algorithms for performing robotic tasks such as grasping, to the best of our knowledge this is the first effort that reverses the cycle and uses robotic tasks for learning visual representations,” the authors write.

In order to do this, the researchers programmed a robotic arm to perform four gestures: grasping, pushing, poking (in order to record pressure on a “skin sensor”), and active vision, which involves seeing an object from multiple angles. The researchers let the arms interact with 100 unique objects, collecting 130,000 data points.

This data was then fed into a “convolutional neural network” (a kind of algorithm) to train the network to learn a visual representation of the objects it interacted with. Having been trained on the touch data, the neural network classified images of objects in the ImageNet research database more accurately than it did without that data.
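As a rough illustration of the collection step described above, the sketch below shows one way interaction records from the four gestures might be stored before training. It is not the researchers' code: the `InteractionRecord` class, its field names, the string labels, and the file paths are all hypothetical, chosen only to make the idea concrete.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List

# The four interaction types named in the article.
# The string labels themselves are hypothetical.
INTERACTIONS = ("grasp", "push", "poke", "active_vision")


@dataclass
class InteractionRecord:
    """One data point gathered by the robot (hypothetical layout)."""
    object_id: int        # which of the objects was handled
    interaction: str      # one of INTERACTIONS
    image_path: str       # camera image saved for this interaction
    signal: List[float] = field(default_factory=list)  # e.g. skin-sensor readings

    def __post_init__(self) -> None:
        # Reject gestures the robot was never programmed to perform.
        if self.interaction not in INTERACTIONS:
            raise ValueError(f"unknown interaction: {self.interaction!r}")


def count_by_interaction(records: List[InteractionRecord]) -> Counter:
    """Tally how many data points each gesture contributed."""
    return Counter(r.interaction for r in records)


# Made-up example values, for illustration only.
records = [
    InteractionRecord(0, "grasp", "img/0001.png", [1.0]),
    InteractionRecord(0, "poke", "img/0002.png", [0.1, 0.4, 0.2]),
    InteractionRecord(1, "push", "img/0003.png", [0.05, -0.02]),
]
print(count_by_interaction(records))
```

In a layout like this, each image is paired with a signal the robot produced itself, so no person has to label the data by hand.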


This technology could be deployed in a range of ways, from industrial uses to consumer robots in the home. Nathan Benaich, a venture capitalist at Playfair Capital in London who focuses on AI, sees potential applications on factory assembly lines, in warehouses for delivery and logistics, and as components of household robot assistants.

Though the CMU experiment is ambitious, there are some caveats. The robot is limited to a tabletop environment, which means that only objects small enough to fit on a table were included, such as atomizers, staplers, and plush toys. The research has also not yet been peer-reviewed (it is currently available as a pre-print), although the team has submitted it to be presented at the European Conference on Computer Vision in October.

Google’s research division has devised a similar experiment with robotic arms, with the aim of teaching the robots “hand-eye coordination.” The Google robots can only perform one gesture, grasping, but the lab set up 14 arms to collect lots of grasping data, feeding that information into a convolutional neural network to determine the optimal way for robots to pick things up.

Unsupervised learning

Experiments like those by the CMU researchers and Google are ambitious attempts to address two problems simultaneously: collecting the data needed to train AIs, and improving “unsupervised learning,” in which AIs make sense of data without human-provided labels. Instead of relying on manually assembled data sets fed in by humans, the idea is to create robots that can obtain the data themselves and then feed it to AIs.

“The overall idea of robots learning from continuous interaction is a very powerful one,” said Sergey Levine, a professor at the University of Washington who works on AI projects with Google. “Robots collect their own data, and in principle they can collect as much of it as they need, so learning from continuous active interaction has the potential to enable tremendous progress in machine perception, autonomous decision making, and robotics.”

Robots put in the service of collecting data for AIs, then, could help artificial intelligences learn in a more human way.