Speech + Mobility
How speech technologies and portable devices can enhance communication.
The Speech + Mobility group uses speech technologies and portable devices to enhance human communication and make digitized audio more useful as a data type. Our focus is on developing novel applications, user interfaces, and services to exploit computer speech processing for interacting with and through computers far removed from keyboards and monitors.

Research Projects

  • Back Talk

    Chris Schmandt and Andrea Colaco
    The living room is the heart of social and communal interactions in a home. Often present in this space is a screen: the television. When in use, this communal gathering space brings together people and their interests, and their varying needs for company, devices, and content. This project focuses on using personal devices such as mobile phones with the television; the phone serves as a controller and social interface by offering a channel to convey engagement, laughter, and viewer comments, and to create remote co-presence.
  • Flickr This

    Chris Schmandt and Dori Lin
    Inspired by the fact that people are communicating more and more through technology, Flickr This explores ways for people to have emotion-rich conversations through all kinds of media provided by people and technology. By grounding them in shared media, the technology allows remote people to have conversations that are more like face-to-face experiences. Flickr This lets viewable content provide structure for a conversation; the conversation can move between synchronous and asynchronous modes, and evolve into richer collaborative conversation and media.
  • frontdesk

    Chris Schmandt and Andrea Colaco

    Calling a person versus calling a place has quite distinctive affordances. With the arrival of mobile phones, the concept of calling has moved from calling a place to calling a person. Frontdesk proposes a place-based communication tool that is accessed primarily through any mobile device and features voice calls and text chat. The application uses “place” loosely to define a physical space created by a group of people who have a shared context of that place. Examples of places could be different parts of a workspace in a physical building, such as the machine shop, café, or Speech + Mobility group area at the Media Lab. When a user calls any of these places, frontdesk routes their call to all people who are “checked in” to the place.
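
    The check-in and routing idea can be illustrated with a minimal sketch. This is not frontdesk's actual implementation; the class name `FrontDesk`, its methods, and the in-memory registry are hypothetical, chosen only to show how a call to a place could fan out to everyone currently checked in there.

```python
from collections import defaultdict


class FrontDesk:
    """Hypothetical in-memory registry of who is checked in to which place."""

    def __init__(self):
        # Maps a place name to the set of people currently checked in there.
        self._checked_in = defaultdict(set)

    def check_in(self, place, person):
        self._checked_in[place].add(person)

    def check_out(self, place, person):
        # discard() is a no-op if the person was not checked in.
        self._checked_in[place].discard(person)

    def route_call(self, place):
        """Return, in sorted order, everyone who should receive a call to `place`."""
        return sorted(self._checked_in.get(place, set()))


desk = FrontDesk()
desk.check_in("machine shop", "ana")
desk.check_in("machine shop", "bo")
desk.check_in("café", "cy")
recipients = desk.route_call("machine shop")
```

    In this sketch a call to a place with nobody checked in simply returns an empty list; how a real system would handle that case is not described in the project summary.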

  • Going My Way

    Chris Schmandt and Jaewoo Chung
    When friends give directions, they often don't describe the whole route, but instead provide landmarks along the way that they think will be familiar. Friends can assume we have certain knowledge because they know our likes and dislikes. Going My Way attempts to mimic a friend by learning about where you travel, identifying areas along your frequent paths that are close to the desired destination, and picking a set of landmarks to allow you to choose a familiar one. When you select one of the provided landmarks, Going My Way will provide directions from it to the destination.
  • Guiding Light

    Chris Schmandt, Jaewoo Chung, Ig-Jae Kim and Kuang Xu
    Guiding Light is a navigation-based application that provides directions by projecting them onto physical spaces both indoors and outdoors. It enables a user to get relevant spatial information by using a mini projector in a cell phone. The core metaphor involved in this design is that of a flashlight which reveals objects in and information about the space it illuminates. For indoor navigation, Guiding Light uses a combination of e-compass, accelerometer, proximity sensors, and tags to place information appropriately. In contrast to existing heads-up displays that push information into the user's field of view, Guiding Light works on a pull principle, relying entirely on users' requests and control of information.
  • Indoor Location Sensing Using Geo-Magnetism

    Chris Schmandt, Jaewoo Chung, Nan-Wei Gong, Wu-Hsi Li and Joe Paradiso

    We present an indoor positioning system that measures location using disturbances of the Earth's magnetic field by structural steel elements in a building. The presence of these large steel members warps the geomagnetic field such that lines of magnetic force are locally not parallel. We measure the divergence of the lines of the magnetic force field using e-compass parts with slight physical offsets; these measurements are used to create local position signatures for later comparison with values from the same sensors at a location to be measured. We demonstrate accuracy within one meter 88 percent of the time in experiments in two buildings and across multiple floors within the buildings.
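
    The comparison of a new reading against stored position signatures can be sketched as a nearest-neighbor lookup. This is an assumption for illustration only: the project summary does not say which matching method is used, and the function name `nearest_location`, the Euclidean distance metric, the location labels, and the sample values below are all hypothetical.

```python
import math


def nearest_location(fingerprints, reading):
    """Return the label whose stored signature is closest to `reading`.

    fingerprints: dict mapping a location label to a tuple of sensor values
                  recorded there earlier (the stored signature).
    reading:      tuple of values from the same sensors at the unknown location.
    Euclidean distance is an assumed metric, not one stated by the project.
    """
    if not fingerprints:
        raise ValueError("fingerprint map is empty")
    return min(fingerprints, key=lambda loc: math.dist(fingerprints[loc], reading))


# Made-up signatures purely for illustration (not measured data).
fingerprints = {
    "hall-A": (22.0, -5.0, 41.0),
    "hall-B": (30.0, 2.0, 35.0),
}
best = nearest_location(fingerprints, (29.0, 1.0, 36.0))
```

    With these illustrative values the reading is much closer to the "hall-B" signature, so that label is returned. A practical system would likely need many signatures per area and some handling of sensor noise, neither of which is covered here.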

  • InterTwinkles

    Chris Schmandt and Charlie DeTar

    Bringing deliberative process and consensus decision-making to the 21st century! A practical set of tools for assisting in meeting structure, deliberative process, brainstorming, and negotiation. Helping groups to democratically engage with each other, across geographies and time zones.

  • LocoRadio

    Chris Schmandt and Wu-Hsi Li

    LocoRadio is a mobile, augmented-reality, audio browsing system that immerses you within a soundscape as you move. To enhance the browsing experience in high-density spatialized audio environments, we introduce a UI feature, "auditory spatial scaling," which enables users to continuously adjust the spatial density of perceived sounds. The audio will come from a custom, geo-tagged audio database. The current demo uses iconic music to represent restaurants. As users move in the city, they encounter a series of pieces of music and the perception enhances their awareness of the numbers, styles, and locations of nearby restaurants.

  • Mime

    Andrea Colaco

    Mime is a compact, low-power 3D sensor for short-range gestural control of small display devices. The sensor's performance is based on a novel signal processing pipeline that combines low-power time-of-flight (TOF) sensing for 3D hand-motion tracking with RGB image-based computer vision algorithms for finer gestural control. Mime is an addition to a growing number of input devices developed around the engineering design philosophy of sacrificing generality for battery-friendly and accurate performance to retain the portability advantages of our smart devices. We demonstrate the utility of Mime for head-mounted display control and smartphones with a variety of application scenarios, including 3D spatial input using close-range gestures, gaming, on-the-move interaction, and operation in cluttered environments and in broad daylight conditions.

  • Musicpainter

    Chris Schmandt, Barry Vercoe and Wu-Hsi Li
    Musicpainter is a networked, graphical composing environment that encourages sharing and collaboration within the composing process. It provides a social environment where users can gather and learn from each other. The approach is based on sharing and managing music creation in small and large scales. At the small scale, users are encouraged to begin composing by conceiving small musical ideas, such as melodic or rhythmic fragments, all of which are collected and made available to all users as a shared composing resource. The collection provides a dynamic source of composing material that is inspiring and reusable. At the large scale, users can access full compositions that are shared as open projects. Users can listen to and change any piece. The system generates an attribution list on the edited piece, allowing users to trace how it evolves in the environment.
  • OnTheRun

    Chris Schmandt and Matthew Joseph Donahoe

    OnTheRun is a location-based exercise game designed for the iPhone. The player assumes the role of a fugitive trying to gather clues to clear his name. The game is played outdoors while running, creating missions that are tailored to the player's neighborhood and running ability. The game is primarily an audio experience, and gameplay involves following turn-by-turn directions, outrunning virtual enemies, and reaching destinations.

  • Pavlov

    Chris Schmandt and Sujoy Kumar Chowdhury

    Pavlov is a virtual pet that encourages you to be physically active. He has ambient presence in the screens with which you interact. Pavlov is happy and healthy when you have walked a certain number of steps every day. When you are sedentary for a while, Pavlov nags you to take him out for a walk. He also craves to be the leader of all Pavlovs in your area. He can become the leader only when you, as his owner, become the most physically active person among your friends. Pavlov pings you every day at a certain time, telling you that he is going to have a dog-fight with other Pavlovs. You have the option to watch the dog-fight. Otherwise, Pavlov simply tells you if he has won the fight, which may indicate that today you were more physically active than your friends.

  • Puzzlaef

    Chris Schmandt, Sinchan Banerjee and Drew Harry

    How can one understand and visualize the lifestyle of a person on the other side of the world? Puzzlaef attempts to tackle this question through a mobile picture puzzle game, which users collaboratively solve with pictures from their lifestyles.

  • Radio-ish Media Player

    Chris Schmandt, Barry Vercoe and Wu-Hsi Li
    How many decisions does it take before you hear a desired piece of music on your iPod? First, you are asked to pick a genre, then an artist, then an album, and finally a song. The more songs you own, the tougher the choices are. To resolve these issues, we have turned the modern music player into an old analog radio tuner, the Radio-ish Media Player. No LCDs, no favorite channels, just a knob that will help you surf through channel after channel accompanied by synthesized noise. Radio-ish is our attempt to revive the lost art of channel surfing in the old analog radio tuner. Let music find you: your ears will tell you if the music is right. This project is not only a retrospective design, but also our reflection on lost simplicity in the process of digitalization. A mobile phone version is also available for demo.
  • ROAR

    Chris Schmandt and Drew Harry

    The experience of being in a crowd is visceral. We feel a sense of connection and belonging through shared experiences like watching a sporting event, speech, or performance. In online environments, though, we are often part of a crowd without feeling it. ROAR is designed to allow very large groups of distributed spectators to have meaningful conversations with strangers or friends while creating a sense of presence of thousands of other spectators. ROAR is also interested in creating opportunities for collective action among spectators and providing flexible ways to share content among very large groups. These systems combine to let you feel the roar of the crowd even if you're alone in your bedroom.

  • SeeIt-ShareIt

    Chris Schmandt and Andrea Colaco

    Now that mobile phones are starting to have 3D display and capture capabilities, there are opportunities to enable new applications that enhance person-person communication or person-object interaction. This project explores one such application: acquiring 3D models of objects using cell phones with stereo cameras. Such models could serve as shared objects that ground communication in virtual environments and mirrored worlds or in mobile augmented reality applications.

  • Spellbound

    Misha Sra and Chris Schmandt

    Turning screen time into activity time, Spellbound is a cooperatively competitive real-time, real-world multiplayer mobile game. It uses a fantasy game context to connect and bring people together around a shared experience, create serendipitous connections, and encourage new kinds of activities in existing physical spaces. The game is freed from the screen and interlaced with the real world by using mobile phones. The game system uses activity detection via sensors on the mobile phone and presence and location detection via GPS. Communication with the game is done using speech interaction with the phone, and output is displayed on the phone screen as well as through a custom wristband interface. Spellbound explores the space between real-world physical activities and fantastical video-game worlds as a place to create new social experiences for both players and audience.

  • Spotz

    Chris Schmandt and Misha Sra

    Exploring your city is a great way to make friends, discover new places, find new interests, and invent yourself. Spotz is an Android app where everyone collectively defines the places they visit and the places in turn define them. Spotz allows you to discover yourself by discovering places. You tag a spot, create some buzz for it and, if everyone agrees the spot is 'fun', this bolsters your 'fun' quotient. If everyone agrees the spot is 'geeky', it pushes up your 'geeky' score. Thus emerges your personal tag cloud. Follow tags to chance upon new places. Find people with 'tag clouds' similar to your own and experience new places together. Create buzz for your favorite spots and track other buzz to find who has the #bestchocolatecake in town!
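
    The way agreement on a tag could feed a personal tag cloud can be sketched as follows. The scoring rule is an assumption, since the summary does not specify how Spotz computes scores: here each tag a user placed on a spot earns one point per other user who agreed with it. The function name `personal_tag_cloud` and the sample spots and counts are hypothetical.

```python
from collections import Counter


def personal_tag_cloud(user_tags, agreements):
    """Build a per-tag score for one user.

    user_tags:  list of (spot, tag) pairs the user created.
    agreements: dict mapping (spot, tag) to the number of other users who
                agreed with that tag (an assumed measure of "everyone agrees").
    """
    cloud = Counter()
    for spot, tag in user_tags:
        # Tags nobody has agreed with yet contribute zero.
        cloud[tag] += agreements.get((spot, tag), 0)
    return cloud


# Illustrative data only.
user_tags = [("arcade", "fun"), ("library", "geeky"), ("pier", "fun")]
agreements = {("arcade", "fun"): 3, ("pier", "fun"): 2, ("library", "geeky"): 4}
cloud = personal_tag_cloud(user_tags, agreements)
```

    Under this assumed rule the user's 'fun' score is the sum of agreements across both spots they tagged as fun. Comparing two users' clouds to find people with similar tastes would need a similarity measure that the summary leaves unspecified.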

  • Tin Can

    Chris Schmandt, Matthew Donahoe and Drew Harry

    Distributed meetings present a set of interesting challenges to staying engaged and involved. Because one person speaks at a time, it is easy (particularly for remote participants) to disengage from the meeting undetected. However, non-speaking roles in a meeting can be just as important as speaking ones, and if we could give non-speaking participants ways to participate, we could help support better-run meetings of all kinds. Tin Can collects background tasks like taking notes, managing the agenda, sharing relevant content, and tracking to-dos in a distributed interface that uses meeting participants' phones and laptops as input devices, and represents current meeting activities on an iPad in the center of the table in each meeting location. By publicly representing these background processes, we provide meeting attendees with new ways to participate and be recognized for their non-verbal participation.

  • Tin Can Classroom

    Chris Schmandt, Drew Harry and Eric Gordon (Emerson College)

    Classroom discussions may not seem like an environment that needs a new kind of supporting technology. But we've found that augmenting classroom discussions with an iPad-based environment to help promote discussion, keep track of current and future discussion topics, and create a shared record of class keeps students engaged and involved with discussion topics, and helps restart the discussion when conversation lags. Contrary to what you might expect, having another discussion venue doesn't seem to add to student distraction; rather it tends to focus distracted students on this backchannel discussion. For the instructor, our system offers powerful insights into the engagement and interests of students who tend to speak less in class, which in turn can empower less-active students to contribute in a venue in which they feel more comfortable.