Project

Realtime Detection of Social Cues

Jin Joo Lee

Realtime detection of social cues in children’s voices

In everyday conversation, people use what are known as backchannels to signal to someone that they are still listening, paying attention, and engaged. As listeners, we smile, nod, and say “uh-huh” to convey attentiveness, and we do this naturally with little thought. We give this feedback not randomly but at certain moments in the conversation because speakers give off social cues that signal upcoming backchanneling opportunities.

A robot listener needs to detect these social cues in order to time its responses carefully. We developed a realtime, rule-based model that detects these cues from the prosody of the speaker’s voice. Working from low-level speech features, the model detects significant changes in pitch, energy shifts, long pauses, and long utterances. Its parameters were trained and tested on a dataset of children’s voices. We then used this model to trigger contingent behaviors of a listening robot, and children were highly engaged with the robots while telling them stories about their day.
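
To make the idea of a rule-based prosodic detector concrete, here is a minimal Python sketch of how the four cue types named above might be flagged from per-frame speech features. It is an illustration under stated assumptions, not the project's actual model: the `Frame` type, the `detect_cues` function, the 10 ms frame size, and every threshold value are hypothetical placeholders rather than the trained parameters.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Frame:
    """One analysis frame of low-level speech features (hypothetical layout)."""
    pitch_hz: Optional[float]  # None when no pitch was found (unvoiced or silent)
    energy: float              # e.g. RMS energy of the frame


# Hypothetical thresholds for illustration only; not the trained parameters.
SILENCE_ENERGY = 0.01         # frames below this energy count as silence
PITCH_CHANGE_RATIO = 0.25     # relative frame-to-frame pitch change
ENERGY_CHANGE_RATIO = 0.5     # relative frame-to-frame energy change
LONG_PAUSE_FRAMES = 50        # 0.5 s of silence, assuming 10 ms frames
LONG_UTTERANCE_FRAMES = 300   # 3 s of continuous speech, assuming 10 ms frames


def detect_cues(frames: List[Frame]) -> List[Tuple[int, str]]:
    """Return (frame_index, cue_name) pairs in the order the cues occur."""
    cues: List[Tuple[int, str]] = []
    prev: Optional[Frame] = None  # last speech frame; cleared by silence
    silence_run = 0
    speech_run = 0

    for i, frame in enumerate(frames):
        if frame.energy < SILENCE_ENERGY:
            # Silent frame: extend the pause and end the current utterance.
            silence_run += 1
            speech_run = 0
            prev = None
            if silence_run == LONG_PAUSE_FRAMES:  # fire once per pause
                cues.append((i, "long_pause"))
            continue

        # Speech frame: extend the utterance and end any pause.
        silence_run = 0
        speech_run += 1
        if speech_run == LONG_UTTERANCE_FRAMES:  # fire once per utterance
            cues.append((i, "long_utterance"))

        if prev is not None:
            # Compare against the previous speech frame only.
            if frame.pitch_hz and prev.pitch_hz:
                change = abs(frame.pitch_hz - prev.pitch_hz) / prev.pitch_hz
                if change > PITCH_CHANGE_RATIO:
                    cues.append((i, "pitch_change"))
            if abs(frame.energy - prev.energy) / prev.energy > ENERGY_CHANGE_RATIO:
                cues.append((i, "energy_shift"))
        prev = frame

    return cues
```

In a realtime setting, the same rules would presumably run incrementally on each incoming frame rather than over a stored list, and each emitted cue could then trigger a listener behavior such as a nod. This sketch also treats any single silent frame as the end of an utterance, which is a simplification.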

Research Topics

#artificial intelligence #data #human-machine interaction #kids #storytelling #social science #machine learning #social robotics #nonverbal behavior #affective computing