Modeling the Interaction between Speech and Gesture

Justine Cassell, Mark Steedman, Norman Badler, Catherine Pelachaud, Matthew Stone, Brett Douville, Scott Prevost and Brett Achorn

Abstract

This paper describes an implemented system that generates spoken dialogue, including speech, intonation, and gesture, using two copies of an identical program that differ only in their knowledge of the world and that must cooperate to accomplish a goal. The output of the dialogue generation drives a three-dimensional interactive animated model: two graphic figures on a computer screen who speak and gesture according to the rules of the system. The system is based on a formal, predictive, and explanatory theory of the gesture-speech relationship. The result is both a working system for realizing autonomous animated conversational agents for virtual reality and other purposes, and a tool for investigating the relationship between speech and gesture.