• The hardware for a new gesture-based computing system consists of nothing more than an ordinary webcam and a pair of brightly colored lycra gloves.

    Photo: Jason Dorfman/CSAIL

  • CSAIL graduate student Robert Wang shows off the new system, which he developed together with associate professor of electrical engineering and computer science Jovan Popović.

    Photo: Jason Dorfman/CSAIL


Gesture-based computing on the cheap

With a single piece of inexpensive hardware — a multicolored glove — MIT researchers are making Minority Report-style interfaces more accessible.


Ever since Steven Spielberg’s 2002 sci-fi movie Minority Report, in which a black-clad Tom Cruise stands in front of a transparent screen manipulating a host of video images simply by waving his hands, the idea of gesture-based computer interfaces has captured the imagination of technophiles. Academic and industry labs have developed a host of prototype gesture interfaces, ranging from room-sized systems with multiple cameras to detectors built into laptops’ screens. But MIT researchers have developed a system that could make gestural interfaces much more practical. Aside from a standard webcam, like those found in many new computers, the system uses only a single piece of hardware: a multicolored Lycra glove that could be manufactured for about a dollar.

Other prototypes of low-cost gestural interfaces have used reflective or colored tape attached to the fingertips, but “that’s 2-D information,” says Robert Wang, a graduate student in the Computer Science and Artificial Intelligence Laboratory who developed the new system together with Jovan Popović, an associate professor of electrical engineering and computer science. “You’re only getting the fingertips; you don’t even know which fingertip [the tape] is corresponding to.” Wang and Popović’s system, by contrast, can translate gestures made with a gloved hand into the corresponding gestures of a 3-D model of the hand on screen, with almost no lag time. “This actually gets the 3-D configuration of your hand and your fingers,” Wang says. “We get how your fingers are flexing.”

The most obvious application of the technology, Wang says, would be in video games: Gamers navigating a virtual world could pick up and wield objects simply by using hand gestures. But Wang also imagines that engineers and designers could use the system to more easily and intuitively manipulate 3-D models of commercial products or large civic structures.

Robert Wang demonstrates the speed and precision with which the system can gauge hand position in three dimensions — including the flexing of individual fingers — as well as a possible application in mechanical engineering.
Video: Robert Y. Wang/Jovan Popović

Patchwork approach

The glove went through a series of designs, with dots and patches of different shapes and colors, but the current version is covered with 20 irregularly shaped patches that use 10 different colors. The number of colors had to be restricted so that the system could reliably distinguish the colors from each other, and from those of background objects, under a range of different lighting conditions. The arrangement and shapes of the patches were chosen so that the front and back of the hand would be distinct but also so that collisions of similar-colored patches would be rare. For instance, Wang explains, the colors on the tips of the fingers could be repeated on the back of the hand, but not on the front, since the fingers would frequently be flexing and closing in front of the palm.

Technically, the other key to the system is a new algorithm for rapidly looking up visual data in a database, which Wang says was inspired by the recent work of Antonio Torralba, the Esther and Harold E. Edgerton Associate Professor of Electrical Engineering and Computer Science in MIT’s Department of Electrical Engineering and Computer Science and a member of CSAIL. Once a webcam has captured an image of the glove, Wang’s software crops out the background, so that the glove alone is superimposed upon a white background. Then the software drastically reduces the resolution of the cropped image, to only 40 pixels by 40 pixels. Finally, it searches through a database containing myriad 40-by-40 digital models of a hand, clad in the distinctive glove, in a range of different positions. Once it’s found a match, it simply looks up the corresponding hand position. Since the system doesn’t have to calculate the relative positions of the fingers, palm, and back of the hand on the fly, it’s able to provide an answer in a fraction of a second.
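The pipeline described above (shrink the cropped image to 40 by 40 pixels, then find the closest stored image and return its hand position) can be illustrated with a minimal Python sketch. This is not Wang's algorithm: the article credits a faster lookup method, whereas the sketch below swaps in a plain brute-force nearest-neighbor search. The function names, the string pose labels, and the solid-color test images are all hypothetical stand-ins chosen for illustration.

```python
from typing import List, Sequence, Tuple

Image = List[List[Tuple[int, int, int]]]  # rows of (R, G, B) pixels

TINY = 40  # side length of the reduced image, as stated in the article


def downsample(image: Image, size: int = TINY) -> Image:
    """Shrink an image to size x size by nearest-pixel sampling."""
    height, width = len(image), len(image[0])
    return [
        [image[r * height // size][c * width // size] for c in range(size)]
        for r in range(size)
    ]


def distance(a: Image, b: Image) -> int:
    """Sum of squared per-channel differences between two same-sized images."""
    return sum(
        (x - y) ** 2
        for row_a, row_b in zip(a, b)
        for px_a, px_b in zip(row_a, row_b)
        for x, y in zip(px_a, px_b)
    )


def lookup_pose(query: Image, database: Sequence[Tuple[Image, str]]) -> str:
    """Return the pose stored with the database image closest to the query.

    Brute-force search: an assumption for clarity, not the article's method.
    """
    tiny = downsample(query)
    _, best_pose = min(database, key=lambda entry: distance(tiny, entry[0]))
    return best_pose


def solid(color: Tuple[int, int, int], size: int) -> Image:
    """Hypothetical helper: a square image filled with one color."""
    return [[color] * size for _ in range(size)]


# Toy database of two entries; a real one would hold many glove renderings.
database = [
    (solid((255, 0, 0), TINY), "open hand"),
    (solid((0, 0, 255), TINY), "fist"),
]
query = solid((250, 10, 5), 80)  # an 80x80 "camera image" that is nearly red
print(lookup_pose(query, database))  # prints "open hand"
```

Because the pose is stored alongside each image, the expensive work of relating image to hand configuration happens once, when the database is built, and the run-time cost is only the search.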

Of course, a database of 40-by-40 color images takes up a large amount of memory: several hundred megabytes, Wang says. But today, a run-of-the-mill desktop computer has four gigabytes (4,000 megabytes) of high-speed RAM. And that number is only going to increase, Wang says.
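A rough back-of-the-envelope calculation shows why the database fits comfortably in memory. The sketch below assumes each image is stored uncompressed at three bytes per pixel (8-bit red, green, and blue), which the article does not state, and uses 300 megabytes as a hypothetical stand-in for "several hundred."

```python
PIXELS_PER_SIDE = 40   # from the article
BYTES_PER_PIXEL = 3    # assumption: uncompressed 8-bit RGB
MEGABYTE = 1_000_000   # decimal megabytes, matching "4,000 megabytes"

bytes_per_image = PIXELS_PER_SIDE * PIXELS_PER_SIDE * BYTES_PER_PIXEL


def images_that_fit(budget_megabytes: int) -> int:
    """How many whole images fit in the given memory budget."""
    return budget_megabytes * MEGABYTE // bytes_per_image


print(bytes_per_image)        # 4800
print(images_that_fit(300))   # 62500  (hypothetical 300 MB database)
print(images_that_fit(4000))  # 833333 (the full four gigabytes)
```

Under these assumptions, a database of several hundred megabytes corresponds to tens of thousands of stored hand images, well within the four gigabytes the article cites.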

Changing the game

“People have tried to do hand tracking in the past,” says Paul Kry, an assistant professor at the McGill University School of Computer Science. “It’s a horribly complex problem. I can’t say that there’s any work in purely vision-based hand tracking that stands out as being successful, although many people have tried. It’s sort of changing the game a bit to say, ‘Hey, okay, I’ll just add a little bit of information’” — the color of the patches — “‘and I can go a lot farther than these purely vision-based techniques.’” Kry particularly likes the ease with which Wang and Popović’s system can be calibrated to new users. Since the glove is made from stretchy Lycra, it can change size significantly from one user to the next; but in order to gauge the glove’s distance from the camera, the system has to have a good sense of its size. To calibrate the system, the user simply places an 8.5-by-11-inch piece of paper on a flat surface in front of the webcam, presses his or her hand against it, and in about three seconds, the system is calibrated.
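The article does not spell out the math behind the calibration step, but the idea of using a sheet of known size can be sketched with a simple pinhole-camera model. Everything below is an illustrative assumption rather than the researchers' procedure: the function names, the pixel measurements, and the focal length in pixels are all hypothetical.

```python
PAPER_WIDTH_IN = 8.5  # short side of the calibration sheet, from the article


def hand_width_inches(hand_px: float, paper_px: float) -> float:
    """Real hand width from hand and paper widths measured in one image.

    The hand is pressed flat against the paper, so both lie at the same
    distance from the camera and share one pixels-per-inch scale.
    """
    return hand_px * PAPER_WIDTH_IN / paper_px


def distance_inches(hand_in: float, observed_px: float, focal_px: float) -> float:
    """Pinhole-camera estimate of the hand's distance from the camera."""
    return focal_px * hand_in / observed_px


# Calibration frame: hand spans 160 px, paper's short side spans 340 px.
hand_in = hand_width_inches(hand_px=160.0, paper_px=340.0)
print(hand_in)  # 4.0

# Later frame: the same hand now spans 100 px; focal length assumed 600 px.
print(distance_inches(hand_in, observed_px=100.0, focal_px=600.0))  # 24.0
```

Once the glove's true size is known, its apparent size in any later frame is enough to estimate how far it is from the camera, which is why a stretchy glove has to be measured once per user.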

Wang initially presented the glove-tracking system at last year’s Siggraph, the premier conference on computer graphics. But at the time, he says, the system took nearly a half-hour to calibrate, and it didn’t work nearly as well in environments with a lot of light. Now that the glove tracking is working well, however, he’s expanding on the idea, with the design of similarly patterned shirts that can be used to capture information about whole-body motion. Such systems are already commonly used to evaluate athletes’ form or to convert actors’ live performances into digital animations, but a system based on Wang and Popović’s technique could prove dramatically cheaper and easier to use.


Topics: Computer Science and Artificial Intelligence Laboratory (CSAIL), Computer science and technology, Computer vision, Electrical engineering and electronics, Innovation and Entrepreneurship (I&E), Motion sensing, Gestural interfaces

Comments

Nice job, Robert and Jovan! This is exactly the type of technology that people need to interact with their PCs - not the fancy touchscreens or voice control! Keep up the good work!
Congratulations on a wonderful idea and execution. Our startup uses the opensource version of Second Life called Opensimulator.org to develop educational applications for disadvantaged children. Last year we received seed funding from Social Entrepreneurs Ireland. I would love to know if your glove based system could be adapted to work with Opensim (ie. the Second Life viewer and opensource clones)? If the kids using our system could use their hands intuitively to build 3D models it would open up a whole new world to them.
I write with mixed feelings to congratulate your hard work and success. At the start of 2009 I developed a similar process (involving a webcam). But alas, I had no funding and only made patents. But still, in the training I sustained from my Berkeley education, I know when to recognize accomplishments in others even when it competes with mine. For in the end, it's to help society and people. ~Congratulations~
Now are you going to open source it? If not, let's get some licensees lined up, and quick! From that video demo it seems very nearly ready for prime time, and I for one am eager to see this in actual products.
The computer can feel our finger without touching it, awesome! Maybe I can use a tiny piece of stretchy Lycra on my fingertip to do the job that the mouse does.
I can see a big-budget film made about this. The main character flexes her hand in a way the computer cannot comprehend and she breaks a link to the Matrix. Pudgypaw, what do you think is the biggest obstacle in obtaining funding for this technology (yours and Robert and Jovan's)?
Hi there, I am a developer of Second Life gadgets (for educators), http://b3dmultitech.com. Any chance you'll bring these into Second Life?
While investors want to maximize leverage on a good deal, the current times have dampened the risk-taking mentality. Granted, my work piques investors' interest, but the negotiations have been heartbreaking and soul-crushing for me as a recent college graduate. Most investors just want to hear the idea and run to someone else.
Is there some way to follow this project, or is it dead? Thanks.