July 14, 2017
Two teams of Princeton graduate students are making strong showings in national robotics competitions this year. The teams are combining advances in computation with those in sensing technology.
One group is joining with teammates at the Massachusetts Institute of Technology later this month for the third annual Amazon Robotics Challenge, in Nagoya, Japan. The challenge asks teams to develop a robot that can recognize various objects that it has never seen before, pick them up, and pack them in a box. The team finished third overall in last year's challenge. The second team is a finalist in Amazon's ongoing Alexa competition, which challenges teams to create software that converses naturally with people.
The Alexa team is plumbing methods to understand and work with language, while the Robotics Challenge team is pushing the boundaries of computer vision and image processing.
"I'm excited to see Princeton teams leading in robotics competitions," said Jennifer Rexford, the Gordon Y. S. Wu Professor in Engineering and chair of the Department of Computer Science. "The transition of robotics from controlled environments, like factories, to the complex human world brings tremendous opportunity to serve society, while also raising difficult technical challenges. Princeton is tackling these challenges by bringing advances in sensing and computation together to allow future robots to understand the world around them and interact safely in human society."
A sharp eye and a delicate touch
Asked about the challenge of creating accurate robotic vision, Andy Zeng and Shuran Song produced a small black metal-mesh basket. It was an everyday object, the kind of thing used to hold pencils on a desk. But as Zeng slowly spun the small basket in his hand, he said, "It may not seem like it, but this is actually one of the more challenging objects to handle.
"This is because the black reflective surface makes it hard for 3D sensors to see," Zeng added. "Such surfaces appear often in our everyday environments, but are less addressed in recent computer vision research. By using various objects with special properties like this one, the competition forces us to tackle challenging vision problems for real-world scenarios."
In a video the team made last year, viewers can see a robot painted Princeton orange look into a red container to identify an object using a sensor on the robotic arm. After determining that the object is a coffee can, the robot cranes down into the container, uses high-powered suction cups to pick up the can, then lifts its arm upward and places the can onto a shelf.
Princeton’s team is led by Zeng and Song, both graduate students in Computer Science, and their faculty adviser is Thomas Funkhouser. While the Princeton students work on building an algorithm for identifying objects, the MIT students are working on “manipulation,” or using the robotic arm and hand to grasp and move the objects.
This will be the second year the team has taken part in the challenge, which was originally known as the Amazon Picking Challenge. The students said that although the algorithm they built last year was accurate, the two teams that finished ahead of them managed to build robots that worked more quickly. “Our biggest weakness was speed,” Song said.
They are working to solve that problem by adding more cameras, in addition to the camera on the robotic arm. With that change, along with an improved algorithm, hopefully “we can obtain a speedup similar to the teams that performed well last year,” Zeng said.
Last year, the team was given a list of objects in advance that the robot might be asked to identify, grasp and move. But this year, the challenge will be harder: Participants will have that information only a half hour in advance. That has required them to create an algorithm that is more versatile, the students said.
“We have to adapt our algorithm to be versatile enough so that it can still recognize these new objects” with less time, Zeng said. “We’re working on the algorithms to solve that.”
A better voice
Since childhood, Cyril Zhang has dreamed of simulating consciousness in a machine. Although scientists still debate whether creating a truly conscious machine is even possible, for now the second-year doctoral student in computer science is working on something that is "perhaps as close as you can get."
In early October, Zhang and twelve other graduate students in computer science began working on a software system designed to converse coherently and engagingly with humans on a variety of popular topics. The effort is part of Amazon's Alexa Prize competition, which requires international teams of university students to create software that can carry on such a conversation for at least twenty minutes using Alexa, Amazon’s voice service, as a starting point. The winning team will receive a $500,000 award as well as a $1 million research grant for their department.
In November 2016, Amazon notified the Princeton team that it was one of twelve university groups chosen to be sponsored to participate in the competition, which meant they would receive a $100,000 stipend and support for their efforts. Since then, team members have been meeting every Thursday with their faculty adviser, Professor Sanjeev Arora, to coordinate their respective tasks.
Software that emulates human behavior is known as a socialbot. The Princeton team chose to name its socialbot “Pixie,” as the most concise combination of “Princeton” and “Alexa” that they could devise, according to Daniel Suo, a member of the team.
The team grew out of a weekly graduate student reading group on the topic of deep learning. Zhang described the team as “outsiders” since most of the students do not specialize in natural language processing research, but have experience in other fields such as machine learning theory, deep learning, computer vision, robotics, or distributed systems.
“Our simultaneous strength and weakness is that we come from a variety of research backgrounds,” Zhang said. “What that means is that I’m optimistic we can come up with something that may never have occurred to someone who has spent a long time in the natural language processing field. But at the same time we are definitely spending a lot of effort getting oriented to techniques that researchers in the field are already completely comfortable with.”
Emulating human conversation has long been a challenge in software design. Humans communicate in ambiguous terms, and correctly interpreting words and sentences depends on context, common sense, and some understanding of the world. Because computers lack such prior knowledge and rely on precision, programming computers to make sense of ambiguities is extremely difficult.
“For any particular input, the bot has to determine – is the user trying to talk about a specific topic? Is it more just general chitchat? Which sources might be needed for generating a suitable response?” said Ari Seff, a second-year graduate student in computer science.
The team also faces the broader challenge of designing a coherent personality that will entertain the user and keep the conversation going in a natural, non-disjointed way.
“The Amazon competition challenges us to think about conversation from a social perspective,” Suo said. “It would get boring to talk to a bot that just told endless one-liners or just answered fact-based questions. But what about language cues indicating how interested or bored someone is? Can we guide the conversation to a new area rather than just react to the user?”
Members of the team expressed excitement at the collaborative nature of the project and the possibility of new and disruptive ideas growing from it.
“It's an opportunity for us to build something together, but to also learn from each other,” Suo said.
In April, entrant teams received feedback from real-life Amazon Echo users on the success of their socialbots based on the relevance, coherence, interest and speed of the conversation. The final prize winners will be announced in November 2017.
“If we win or if we don’t win is not the point,” team member Davit Buniatyan said. “The fact is that this research is advancing the future of machine learning.”
Pixie Team Members: Alex Beatson, Ari Seff, Cyril Zhang, Daniel Suo, Davit Buniatyan, Holden Lee, Jason Ge, Karan Singh, Kiran Vodrahalli, Misha Khodak, Nikunj Saunshi, Niranjani Prasad, and Oluwatosin Adewale.