The Human Speechome Project
We seek to better understand how children learn the meaning of words
through analysis of observational recordings of child-caregiver
interactions in natural contexts. Currently available corpora greatly
under-sample crucial early stages of child development. As a result,
our understanding of language acquisition hinges on surprisingly
sparse and incomplete data. Motivated by this basic problem, Roy has
begun a pilot project in which he is recording his son's development
at home by gathering approximately 10 hours of high-fidelity audio
and video on a daily basis from birth to age three. The resulting
corpus, which already contains over 100,000 hours of multi-track
recordings, constitutes the most comprehensive record of a child's
development made to date. This data provides many new opportunities
to understand the fine-grained dynamics of language development.
A principal challenge of the project is to efficiently transcribe and
annotate the massive corpus. New software algorithms and human-
computer interfaces will be developed that enable a small team of
researchers to quickly and accurately code the raw data semi-
automatically. Using these software tools, we plan to study and
computationally model the early words uttered by the child by tracing
them back to the contexts in which adults used them when speaking to him.
For most children, language development is steady, progressive, and,
to a casual observer, effortless. But for some children -- those with
developmental delays due to biological or environmental causes --
language is a major developmental hurdle. Understanding the
regularities in home environments is essential to understanding
mechanisms of language acquisition, causes of delay, and ultimately
appropriate intervention procedures. We believe this project will shed new light on fundamental aspects of how child-caregiver social interactions shape language acquisition.
Although there are clear limits to what may be concluded from
studying a single child, in the time-honored tradition of
longitudinal case studies dating back to Piaget, the findings from
this project may guide more extensive follow-on observational and
experimental studies. Beyond the Speechome corpus, the development of
an effective semi-automated data coding and analysis methodology may
enable scientists to leverage high-density audio-visual corpora to
address numerous open questions in the behavioral sciences.
Privacy Statement
Audio and video recording of children in their homes is a widely used
method, well established in the field of developmental psychology and
governed by mature ethical norms (e.g., see http://childes.psy.cmu.edu/). Our project is distinct due to the unusual
sampling density of the recordings. There is no plan to distribute or
publish the complete original recordings due to privacy
considerations, although we will explore ways to work with other
researchers by sharing appropriately coded and selected portions of
the full corpus.
Media
H2.0 presentation on the Speechome project by Professor Roy.
Sample video image from the kitchen. [JPG, 101K]
Timelapse video of a day of life at home. [QuickTime, 3.5 megs]
Evolution of "water" over several months. [WAV, 3.5 megs]
Video collage of "ball" over several months. [QuickTime, 1.2 megs]
Video visualization of caregiver and child interaction. [high resolution PNG (3M) | low resolution PNG (145K)]
Dynamic generation of video visualization. [QuickTime, 3.6 megs]
Photo/video credit: MIT Media Lab
Publications
Deb Roy, Rupal Patel, Philip DeCamp, Rony Kubat, Michael Fleischman, Brandon Roy, Nikolaos Mavridis, Stefanie Tellex, Alexia Salata, Jethran Guinness, Michael Levit, Peter Gorniak. (2006). The Human Speechome Project. Proceedings of the 28th Annual Cognitive Science Conference. pdf (756K)