ANIMATED CONVERSATION: Rule-based Generation of Facial Expression, Gesture & Spoken Intonation for Multiple Conversational Agents

Justine Cassell Catherine Pelachaud Norman Badler Mark Steedman
Brett Achorn Tripp Becket Brett Douville Scott Prevost Matthew Stone
Department of Computer & Information Science, University of Pennsylvania

We describe an implemented system which automatically generates and animates conversations between multiple human-like agents with appropriate and synchronized speech, intonation, facial expressions, and hand gestures. Conversations are created by a dialogue planner that produces the text as well as the intonation of the utterances. The speaker/listener relationship, the text, and the intonation in turn drive facial expression, lip motion, eye gaze, head motion, and arm gesture generators. Coordinated arm, wrist, and hand motions are invoked to create semantically meaningful gestures. Throughout, we will use examples from an actual synthesized, fully animated conversation.

Conversation is an interactive dialogue between two agents. Conversation includes spoken language (words and contextually appropriate intonation marking topic and focus), facial movements (lip shapes, emotions, gaze direction, head motion), and hand gestures (hand shapes, points, beats, and motions representing the topic of accompanying speech). Without all of these verbal and non-verbal behaviors, one cannot have realistic, or at least believable, autonomous agents. To limit the problems (such as voice and face recognition) inherent in interpreting such behaviors, we have chosen to generate both sides of the dialogue and synchronize the speech with animation.

In people, speech, facial expressions, and gestures are physiologically linked. While an expert animator may realize this unconsciously in the "look" of a properly animated character, a program that automatically generates motions must know the rules in advance. This paper presents a working system that realizes interacting animated agents.
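To make the pipeline concrete, the following Python sketch shows how a dialogue planner's annotated output could drive separate gesture and gaze generators. It is a minimal toy under our own assumptions: the class names, the single beat-gesture rule, and the single gaze rule are illustrative stand-ins, not the system's actual implementation.

```python
# Minimal sketch of the generation pipeline: a dialogue planner yields
# utterances annotated with intonation, and each annotated utterance
# drives independent behavior generators on a shared word-index timeline.
# All names and rules here are hypothetical illustrations.

from dataclasses import dataclass

@dataclass
class Utterance:
    speaker: str   # the agent speaking this turn
    listener: str  # the other agent, who attends and reacts
    words: list    # the planned text, word by word
    accents: list  # one pitch-accent mark per word ('H*', 'L+H*', or '')

def plan_dialogue():
    """Toy dialogue planner: returns both sides of a two-turn exchange,
    with pitch accents marking the focused (new-information) words."""
    return [
        Utterance("Gilbert", "George",
                  ["Do", "you", "have", "a", "blank", "check?"],
                  ["", "", "", "", "H*", "H*"]),
        Utterance("George", "Gilbert",
                  ["Yes,", "I", "have", "a", "blank", "check."],
                  ["H*", "", "", "", "", ""]),
    ]

def gesture_track(u):
    """Toy gesture rule: place a beat gesture on each accented word."""
    return [(i, "beat") for i, a in enumerate(u.accents) if a]

def gaze_track(u):
    """Toy gaze rule: the speaker looks at the listener at turn end."""
    return [(len(u.words) - 1, f"gaze at {u.listener}")]

if __name__ == "__main__":
    for u in plan_dialogue():
        print(u.speaker, ":", " ".join(u.words))
        print("  gestures:", gesture_track(u))
        print("  gaze:    ", gaze_track(u))
```

The point of the sketch is the data flow rather than the rules themselves: text and intonation are planned once, and every nonverbal channel is derived from (and therefore synchronized with) that single annotated representation.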