Object-Based Media
Our group explores how distributing computational intelligence throughout video and audio communication systems can create a richer connection between the people at each end. In particular, we seek to build systems that represent content as a collection of meaningful objects accompanied by procedural metadata. To support this vision, we develop not only applications and tools, but also novel content-understanding methods and hardware/software systems.
Research Projects
Consumer Holo-Video
The goal of this project, building upon work begun by Stephen Benton and the Spatial Imaging group, is to create an inexpensive desktop monitor for a PC or game console that displays holographic video images in real time, suitable for entertainment, engineering, or medical imaging. To date, we have demonstrated the fast rendering of holo-video images from OpenGL databases on off-the-shelf PC graphics cards; current research addresses new optoelectronic architectures to reduce the size and manufacturing cost of the display system.
Everything Tells a Story
As a follow-up to the Graspables project, we are exploring what happens when a wide range of everyday consumer products can sense how they are used, interpret what they sense in human terms (using pattern-recognition methods), and retain memories. Users can then construct a narrative with the aid of the "diaries" kept by their sporting equipment, luggage, furniture, toys, and other things with which they interact.
Guided-Wave Light Modulator
We are developing inexpensive, efficient, high-bandwidth light modulators based on lithium niobate guided-wave technology. These modulators are suitable for demanding, specialized applications such as holographic video displays, as well as other light modulation uses such as compact video projectors.
SurroundVision
We are exploring technical and creative implications of using a mobile phone (and possibly also dedicated devices like toys) as a controllable "second screen" for enhancing television viewing. Thus a viewer could use the phone to look beyond the edges of the television to see the audience for a studio-based program, to pan around a sporting event, to take snapshots for a scavenger hunt, or to simulate binoculars to zoom in on a part of the scene.
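As a rough illustration of the "look beyond the edges" idea, the sketch below maps a phone's orientation to a crop window inside a wider scene image. It assumes a wide-angle source image with a known field of view and the television picture at its center; the function name, image sizes, and angles are hypothetical and not taken from the project.

```python
def viewport_for_orientation(yaw_deg, pitch_deg,
                             scene_size=(3600, 1200),   # assumed wide scene, in pixels
                             scene_fov=(180.0, 60.0),   # assumed field of view, in degrees
                             view_size=(640, 360)):     # assumed phone viewport, in pixels
    """Return (left, top, width, height) of the crop shown on the phone.

    yaw_deg and pitch_deg are measured relative to pointing straight at the
    television, which is assumed to sit at the center of the wide scene.
    """
    scene_w, scene_h = scene_size
    view_w, view_h = view_size
    px_per_deg_x = scene_w / scene_fov[0]
    px_per_deg_y = scene_h / scene_fov[1]

    # Turning right moves the window right; tilting up moves it up the image.
    center_x = scene_w / 2 + yaw_deg * px_per_deg_x
    center_y = scene_h / 2 - pitch_deg * px_per_deg_y

    # Clamp so the window never leaves the available scene.
    left = min(max(center_x - view_w / 2, 0), scene_w - view_w)
    top = min(max(center_y - view_h / 2, 0), scene_h - view_h)
    return (int(left), int(top), view_w, view_h)
```

With these assumed defaults, pointing straight at the television selects the window at the center of the scene, and turning the phone 10 degrees to the right shifts that window 200 pixels to the right.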
The "Bar of Soap": Grasp-Based Interfaces
We have built several handheld devices that combine grasp and orientation sensing with pattern recognition in order to provide user interfaces that adapt to how the device is held. The Bar of Soap is a handheld device that senses the pattern of touch and orientation when it is held, and reconfigures itself to act as one of a variety of devices, such as a phone, camera, remote control, PDA, or game machine. Pattern-recognition techniques allow the device to infer the user's intention based on grasp. Another example is a baseball that determines a user's pitching style as an input to a video game.
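To make the idea of inferring intention from grasp concrete, here is a minimal sketch of one simple pattern-recognition approach, a nearest-centroid classifier. It is not the method used in the Bar of Soap; the eight touch sensors, the gravity vector, the training samples, and the mode names are all assumptions chosen for illustration.

```python
import math

# Hypothetical training data: each sample pairs a touch pattern (1 = sensor
# touched) with a gravity direction from an accelerometer. Values are made up.
TRAINING = {
    "phone":  [([1, 1, 0, 0, 1, 1, 0, 0], (0.0, 0.9, 0.3)),
               ([1, 1, 0, 0, 1, 0, 0, 0], (0.1, 0.8, 0.4))],
    "camera": [([0, 0, 1, 1, 0, 0, 1, 1], (0.0, 0.1, 1.0)),
               ([0, 1, 1, 1, 0, 0, 1, 1], (0.1, 0.0, 0.9))],
    "remote": [([1, 0, 0, 0, 0, 0, 0, 1], (0.9, 0.1, 0.2)),
               ([1, 0, 0, 0, 0, 0, 1, 1], (1.0, 0.0, 0.1))],
}


def features(touch, gravity):
    """Concatenate touch and orientation readings into one feature vector."""
    return [float(t) for t in touch] + list(gravity)


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


# One average feature vector per device mode.
CENTROIDS = {
    mode: centroid([features(touch, gravity) for touch, gravity in samples])
    for mode, samples in TRAINING.items()
}


def classify_grasp(touch, gravity):
    """Return the mode whose centroid is closest (Euclidean) to this grasp."""
    x = features(touch, gravity)
    return min(CENTROIDS, key=lambda mode: math.dist(x, CENTROIDS[mode]))
```

A grasp that resembles the assumed "camera" samples, such as `classify_grasp([0, 0, 1, 1, 0, 0, 1, 1], (0.0, 0.0, 1.0))`, is assigned to the camera mode. A real device would need many more samples per mode and probably a more robust classifier.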
uCom
uCom (the "u" stands for "ubiquitous") is a follow-on to our iCom (Media Lab/Media Lab Europe, 1999) system for connecting architectural spaces to enable collaboration by distributed groups. uCom takes advantage of input/output resources (e.g., displays, cameras, speakers, sensors) already in place; it is not restricted to one "window" in one location, but rather creates multiple video and audio portals between the spaces; it scales in richness as the number of input and output devices increases; and it has both a real-world and a virtual-world presence. The virtual-world model can be accessed by users in other places or used to replay past events. uCom also aggregates sensor data so that processes can draw higher-level inferences (e.g., to understand the actions of the community of users), monitor individual users (e.g., for health or disability reasons), or augment the space with additional virtual information.
Vision-Based Interfaces for Mobile Devices
Mobile devices with cameras have enough processing power to perform simple machine-vision tasks, and we are exploring how this capability can enable new user interfaces for applications. Examples include dialing someone by pointing the camera at the person's photograph, or using the camera as an input for navigating virtual spaces larger than the device's screen.
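The navigation example can be sketched as follows, assuming a simple technique chosen here for illustration rather than the project's own: the device estimates how the image shifted between two small grayscale frames by exhaustive block matching, then pans a viewport over the virtual space in the opposite direction. The function names, gain, and sizes are hypothetical.

```python
def estimate_shift(prev, curr, max_shift=2):
    """Estimate the image motion (dx, dy) between two grayscale frames.

    Frames are equal-sized lists of rows of pixel values. The result is the
    shift for which curr[y][x] best matches prev[y - dy][x - dx], judged by
    mean absolute difference over the overlapping region.
    """
    h, w = len(prev), len(prev[0])
    best, best_err = (0, 0), float("inf")
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            total, count = 0, 0
            for y in range(max(0, dy), min(h, h + dy)):
                for x in range(max(0, dx), min(w, w + dx)):
                    total += abs(curr[y][x] - prev[y - dy][x - dx])
                    count += 1
            err = total / count
            if err < best_err:
                best, best_err = (dx, dy), err
    return best


def pan_viewport(origin, shift, gain=10, space=(2000, 2000), view=(320, 240)):
    """Move the viewport opposite to the image motion, clamped to the space.

    When the camera moves right, the scene appears to move left in the image,
    so the viewport is moved by the negated shift scaled by an assumed gain.
    """
    x = min(max(origin[0] - gain * shift[0], 0), space[0] - view[0])
    y = min(max(origin[1] - gain * shift[1], 0), space[1] - view[1])
    return (x, y)
```

Exhaustive matching is only practical on heavily downsampled frames and small search ranges; the sketch favors clarity over speed.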