• Photo: Bryce Vickmark

  • MIT students (left to right) Ayush Bhandari, Refael Whyte and Achuta Kadambi pose next to their "nano-camera" that can capture translucent objects, such as a glass vase, in 3-D.

    Photo: Bryce Vickmark

  • Ramesh Raskar

    Photo: Dominick Reuter


Inexpensive ‘nano-camera’ can operate at the speed of light

Device could be used in medical imaging, collision-avoidance detectors for cars, and interactive gaming.


A $500 “nano-camera” that can operate at the speed of light has been developed by researchers in the MIT Media Lab.

The three-dimensional camera, which was presented last week at Siggraph Asia in Hong Kong, could be used in medical imaging and collision-avoidance detectors for cars, and to improve the accuracy of motion tracking and gesture-recognition devices used in interactive gaming.

The camera is based on “Time of Flight” technology like that used in Microsoft’s recently launched second-generation Kinect device, in which the location of objects is calculated by how long it takes a light signal to reflect off a surface and return to the sensor. However, unlike existing devices based on this technology, the new camera is not fooled by rain, fog, or even translucent objects, says co-author Achuta Kadambi, a graduate student at MIT.

“Using the current state of the art, such as the new Kinect, you cannot capture translucent objects in 3-D,” Kadambi says. “That is because the light that bounces off the transparent object and the background smear into one pixel on the camera. Using our technique you can generate 3-D models of translucent or near-transparent objects.”
 

In a conventional Time of Flight camera, a light signal is fired at a scene, where it bounces off an object and returns to strike the pixel. Since the speed of light is known, it is then simple for the camera to calculate the distance the signal has travelled and therefore the depth of the object it has been reflected from.
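
The arithmetic behind that calculation can be sketched in a few lines of Python. This is only an illustration of the round-trip relationship described above, not code from the researchers; the function name `depth_from_round_trip` is hypothetical, and the sketch assumes a single clean reflection and light travelling at its vacuum speed.

```python
# Speed of light in vacuum, in metres per second (exact by SI definition).
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def depth_from_round_trip(round_trip_seconds: float) -> float:
    """Return the one-way distance in metres for a measured round-trip time.

    The signal travels to the object and back, so the depth is half of the
    total distance covered. Hypothetical helper, for illustration only.
    """
    if round_trip_seconds < 0:
        raise ValueError("round-trip time cannot be negative")
    return SPEED_OF_LIGHT_M_PER_S * round_trip_seconds / 2.0


# A return that arrives 10 nanoseconds after emission corresponds to an
# object roughly 1.5 metres away.
print(depth_from_round_trip(10e-9))
```

Because light covers about 30 centimetres per nanosecond, a camera needs timing on the order of nanoseconds or better to resolve depth at room scale.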

Unfortunately, changing environmental conditions, semitransparent surfaces, edges, or motion all create multiple reflections that mix with the original signal and return to the camera, making it difficult to determine which measurement is correct.

Instead, the new device uses an encoding technique commonly used in the telecommunications industry to calculate the distance a signal has travelled, says Ramesh Raskar, an associate professor of media arts and sciences and leader of the Camera Culture group within the Media Lab, who developed the method alongside Kadambi, Refael Whyte, Ayush Bhandari, and Christopher Barsi at MIT and Adrian Dorrington and Lee Streeter from the University of Waikato in New Zealand.

“We use a new method that allows us to encode information in time,” Raskar says. “So when the data comes back, we can do calculations that are very common in the telecommunications world, to estimate different distances from the single signal.”
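
One way to see how a coded signal lets several distances be estimated from a single mixed return is sketched below in Python. It is a toy illustration under loud assumptions, not the team's method: the article does not say which code the camera uses, so a maximum-length pseudo-random sequence (a common choice in telecommunications) is assumed; the scene is reduced to two reflections with made-up delays and strengths; noise is ignored; and every function name is hypothetical.

```python
def m_sequence(length: int = 127) -> list[int]:
    """Pseudo-random +/-1 code from the recurrence s[n] = s[n-7] XOR s[n-6].

    Assumed code for illustration; it repeats every 127 samples.
    """
    bits = [1, 0, 0, 0, 0, 0, 0]  # any non-zero seed works
    while len(bits) < length:
        bits.append(bits[-7] ^ bits[-6])
    return [1 if b else -1 for b in bits[:length]]


def mix_returns(code: list[int], paths: list[tuple[int, float]]) -> list[float]:
    """Simulate one pixel: a sum of circularly delayed, scaled copies of the code.

    Each path is (delay_in_samples, amplitude), e.g. a glass surface and the
    wall behind it.
    """
    n = len(code)
    return [sum(amp * code[(i - delay) % n] for delay, amp in paths)
            for i in range(n)]


def recover_delays(received: list[float], code: list[int], num_paths: int) -> list[int]:
    """Correlate the mixed signal with the known code and pick the strongest lags."""
    n = len(code)
    corr = [sum(received[i] * code[(i - lag) % n] for i in range(n))
            for lag in range(n)]
    strongest = sorted(range(n), key=lambda lag: corr[lag], reverse=True)[:num_paths]
    return sorted(strongest)


code = m_sequence()
# Two reflections smeared into one pixel: a strong one delayed by 12 samples
# and a weaker one delayed by 31 samples (made-up values).
received = mix_returns(code, [(12, 1.0), (31, 0.4)])
print(recover_delays(received, code, num_paths=2))
```

The correlation peaks sharply only where the known code lines up with a delayed copy of itself, so each reflection shows up as its own peak even though both arrived in the same pixel. Each recovered delay could then be converted to a depth using the speed of light.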

The idea is similar to existing techniques that clear blurring in photographs, says Bhandari, a graduate student in the Media Lab. “People with shaky hands tend to take blurry photographs with their cellphones because several shifted versions of the scene smear together,” Bhandari says. “By placing some assumptions on the model — for example that much of this blurring was caused by a jittery hand — the image can be unsmeared to produce a sharper picture.”

The new model, which the team has dubbed nanophotography, unsmears the individual optical paths.

In 2011 Raskar’s group unveiled a trillion-frame-per-second camera capable of capturing a single pulse of light as it travelled through a scene. The camera does this by probing the scene with a femtosecond impulse of light, then using fast but expensive laboratory-grade optical equipment to take an image each time. However, this “femto-camera” costs around $500,000 to build.

In contrast, the new “nano-camera” probes the scene with a continuous-wave signal that oscillates at nanosecond periods. This allows the team to use inexpensive hardware — off-the-shelf light-emitting diodes (LEDs) can strobe at nanosecond periods, for example — meaning the camera can reach a time resolution within one order of magnitude of femtophotography while costing just $500.

“By solving the multipath problem, essentially just by changing the code, we are able to unmix the light paths and therefore visualize light moving across the scene,” Kadambi says. “So we are able to get similar results to the $500,000 camera, albeit of slightly lower quality, for just $500.”

Conventional cameras see an average of the light arriving at the sensor, much like the human eye, says James Davis, an associate professor of computer science at the University of California at Santa Cruz. In contrast, the researchers in Raskar’s laboratory are investigating what happens when they take a camera fast enough to see that some light makes it from the “flash” back to the camera sooner, and apply sophisticated computation to the resulting data, Davis says.

“Normally the computer scientists who could invent the processing on this data can’t build the devices, and the people who can build the devices cannot really do the computation,” he says. “This combination of skills and techniques is really unique in the work going on at MIT right now.”

What’s more, the basic technology needed for the team’s approach is very similar to that already being shipped in devices such as the new version of Kinect, Davis says. “So it’s going to go from expensive to cheap thanks to video games, and that should shorten the time before people start wondering what it can be used for,” he says. “And by the time that happens, the MIT group will have a whole toolbox of methods available for people to use to realize those dreams.”


Topics: Time-of-flight cameras, 3-D, Ramesh Raskar

Comments

I'm interested in computer vision and object pattern recognition. I'm working towards making significant contributions to this field very soon.
I can use it for toy design tomorrow.
How different is the technology of the Kinect sensor compared to this new camera? According to the article you solved the multipath problem through a complex algorithm, but how does the Kinect solve this problem? Thank you in advance for an answer. Apart from that, really interesting article.
I am especially pleased that there might someday be an actual consumer product that comes from this fascinating research. Hacking Kinect has become a computer phenomenon and the possibilities of using your advanced technology, someday, in a similar fashion could advance the potential uses of this device in areas such as 3D printing or gesture control. On another subject, I have followed the work done by the MIT Media Lab for many years. 5 years ago, just as 3D TV and such was taking off, I lost the use of my right eye. After waiting over 55 years to watch 3D media, I was robbed of that possibility at the last moment. Is any research being done to simulate 3D reception using a monocle device? I don't know if this is even possible. I figure if anyone could accomplish this it would be the MIT Media Lab. There could be additional uses for 1 eyed 3D such as mono vision devices like Google glasses, where you only see the screen through 1 eye. Thanks for all you do.
I need one of those! Please start a crowd funding campaign and let me know when you do!
It reminds me of using maximum length sequences for getting impulse responses of audio signals. Are you using pseudo random signals like these? The results are impressive!