P2PSTORY: Dataset of children as storytellers and listeners in peer-to-peer interactions


Jin Joo Lee

Understanding social-emotional behaviors in storytelling interactions plays a critical role in the development of interactive and educational technologies for children. A challenge when designing for such interactions using technologies like social robots, virtual agents, and tablets is understanding the social-emotional behaviors pertinent to the storytelling context—especially when emulating a natural peer-to-peer relationship between the child and the technology.  We present P2PSTORY, a dataset of young children (5-6 years old) engaging in natural peer-to-peer storytelling interactions with fellow classmates. The dataset contains 58 recorded storytelling sessions along with a diverse set of behavioral annotations as well as developmental and demographic profiles of each child participant. 

The CHI 2018 paper presenting this dataset can be found here: 
Nikhita Singh, Jin Joo Lee, Ishaan Grover, and Cynthia Breazeal (2018). P2PSTORY: Dataset of Children Storytelling and Listening in Peer-to-Peer Interactions. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems.

See below for instructions on how to access the dataset.


Personal Robots Group

Included Features

1. Video and audio: Recordings were collected for each session from three time-synchronized cameras (see image view above) and a high-quality microphone.

2. Behavioral features: Video recordings were coded for a wide range of behaviors including gaze, posture, nods, smiles and frowns, eyebrow movement, and backchannel utterances. In addition, interaction-level features were annotated including listener’s attention and whether dyads were on or off task.

3. Prosodic features: The storyteller’s use of prosodic cues, including pitch, energy, pauses, filled pauses, and long utterances, was also annotated.

4. Personal features: Demographic and socio-emotional development profiles were collected for each participant.

5. Child perceptions: To better understand how children perceived the effectiveness of their interaction partner, participants were asked to rate their partner on measures relating to attention and understanding. 

Request for Access

To access the dataset for academic, non-profit, or research purposes only, please send an email to with the following information:  

  1. Full Name 
  2. Job Title
  3. Academic Affiliation
  4. Research Group Website
  5. Signed Data Use Agreement Document - Signatures of the lead PI and all students using the dataset must be included.



  • Question: The Data Use Agreement has an expiration date. Is it possible to request a renewal?
    Answer: Yes!  

  • Question: The Data Use Agreement says that we need to obtain MIT's written consent before we can "present, submit for publication, publicly post or publish any information contained in or derived from the DATA." What is required for MIT's consent and agreement?
    Answer: The only restriction is demonstrating that the following paper is properly cited: Nikhita Singh, Jin Joo Lee, Ishaan Grover, and Cynthia Breazeal (2018).  P2PSTORY: Dataset of Children Storytelling and Listening in Peer-to-Peer Interactions. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems.

  • Question: Are there any additional papers or resources related to this dataset?
    Answer:  Yes!
    1) JJ Lee, C Breazeal, D DeSteno (2017). Role of Speaker Cues in Attention Inference. Frontiers in Robotics and AI.
    2) HW Park, M Gelsomini, JJ Lee, and C Breazeal (2017). Telling Stories to Robots: The Effect of Backchanneling On A Child’s Storytelling. In Proceedings of the International Conference on Human-Robot Interaction (HRI).