Speech Interface Group

The Speech Interface Group develops applications, services, and user interfaces employing computer speech technologies: speech recognition, text-to-speech synthesis, digital audio recording, and digital signal processing. Rather than developing new underlying algorithms, we aim to develop new ways of thinking about speech for interaction with computers, and about computers for assisting in communication between people. Our work leads us to consider voice across a variety of interaction landscapes, from desktop computers to telephones to various portable audio and communication devices. Voice allows us to compute and communicate electronically outside of traditional computing environments.

Speech at the user interface requires developing dialog systems and understanding the cognitive constraints of audio-only interaction. Auditory interfaces are most successful where they fill a real need, possibly extending a service to a situation where it would otherwise be inappropriate, such as driving a car. Speech interfaces are difficult to design effectively and best flourish where they employ normal human conversational techniques.

Employing voice as a data type comprises both analysis and presentation. Analysis of the acoustic structure inherent in speech allows applications to exploit segmental cues, such as speaker changes, emphasis, or topic shift, to allow browsing and gisting. Presentation techniques such as time scale modification and simultaneous presentation of spatialized audio streams must be applied to interaction frameworks such as real or virtual acoustic environments and physical affordances on portable devices.

Speech was developed by humans who wanted to communicate with each other. By participating in this communication, computers may enhance our lives.


Comments to webmaster. Last updated Oct 29, 1998