Promoting deeper learning and understanding in human networks

Peter Beshai


  • Belen Saldias F. and Deb Roy. (2020). Exploring aspects of similarity between spoken personal narratives by disentangling them into narrative clause types. In Proceedings of the 2020 ACL Workshop on Narrative Understanding, Storylines, and Events (NUSE). ACL, 2020. pdf
  • Bridgit Mendler. (2020). Our Story: Dispute system design technology for stakeholder inclusion. Master's Thesis, MIT Media Lab. pdf
  • Maggie Hughes. (2020). Keeper: Online conversation support scaffolding modeled after ancient and modern social technologies. Master's Thesis, MIT Media Lab. pdf


  • Doug Beeferman, William Brannon, and Deb Roy. (2019). RadioTalk: A large-scale corpus of talk radio transcripts. In Proceedings of the 20th Conference of the International Speech Communication Association (INTERSPEECH 2019). Graz, Austria. 
  • Belen Saldias F. and Rosalind W. Picard. (2019). Tweet Moodifier: Towards giving emotional awareness to Twitter users. In Proceedings of the 8th International Conference on Affective Computing and Intelligent Interaction (ACII 2019). Cambridge, UK. pdf
  • David McClure. (2019). Headlines as networked language: A study of content and audience across 73 million links on Twitter. Master's Thesis, MIT Media Lab.


  • S. Vosoughi, D. Roy, S. Aral. (2018). The spread of true and false news online. Science. Vol 359, Iss 6380. Mar 09 2018.
  • Anneli Hershman*, Juliana Nazare*, Ivan Sysoev, Lauren Fratamico, Juanita Buitrago, Mina Soltangheis, Sneha Makini, Eric Chu, and Deb Roy. (2018). Family Learning Coach: Engaging Families in Children’s Early Literacy Learning with Computer-Supported Tools. In Proceedings of the 26th International Conference on Computers in Education. Philippines, pp. 637-646. *Equal Contribution. pdf
  • Juliana Nazare*, Anneli Hershman*, Ivan Sysoev, Lauren Fratamico, Juanita Buitrago, Mina Soltangheis, Sneha Makini, Eric Chu, and Deb Roy. (2018). Child-Coach-Parent Network for Early Literacy Learning. In Proceedings of the 13th International Conference of Learning Sciences. London, United Kingdom, pp. 1409-1410. *Equal Contribution. pdf
  • Anneli Hershman*, Juliana Nazare*, Jie Qi*, Martin Saveski, Deb Roy, and Mitchel Resnick. (2018). Light It Up: Using Paper Circuitry to Enhance Low-Fidelity Paper Prototypes for Children. In Proceedings of the 17th ACM Conference on Interaction Design and Children. ACM, Trondheim, Norway, pp. 365-372. *Equal Contribution.  pdf


  • Juliana Nazare*, Anneli Hershman*, Ivan Sysoev, and Deb Roy. (2017). Bilingual SpeechBlocks: Investigating How Bilingual Children Tinker with Words in English and Spanish. In Proceedings of the Annual Symposium on Computer-Human Interaction in Play (CHI PLAY ‘17). ACM, Amsterdam, Netherlands, pp. 183-193. *Equal Contribution. pdf 
  • Soroush Vosoughi*, Prashanth Vijayaraghavan*, Ann Yuan, and Deb Roy. (2017). Mapping Twitter Conversation Landscapes. In Proceedings of the 11th International AAAI Conference on Weblogs and Social Media (ICWSM 2017). Montreal, Canada. *Equal Contribution. pdf 
  • Iris Chin, Matthew S. Goodwin, Soroush Vosoughi, Deb Roy, and Letitia R. Naigles. (2017). Dense home-based recordings reveal typical and atypical development of tense/aspect in a child with delayed language development. Journal of Child Language (2017): 1-34. pdf
  • Prashanth Vijayaraghavan, Soroush Vosoughi, Ann Yuan, and Deb Roy. (2017). TweetVista: An AI-Powered Interactive Tool for Exploring Conversations on Twitter. In Proceedings of the 22nd International Conference on Intelligent User Interfaces Companion, pp. 145-148. ACM, 2017. pdf
  • Misha Sra, Prashanth Vijayaraghavan, Ognjen Rudovic, Pattie Maes, and Deb Roy. (2017). DeepSpace: Mood-Based Image Texture Generation for Virtual Reality from Music. IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 10 pages. pdf 
  • Soroush Vosoughi, Mostafa ‘Neo’ Mohsenvand, and Deb Roy. (2017). Rumor Gauge: Predicting the Veracity of Rumors on Twitter. ACM Transactions on Knowledge Discovery from Data (TKDD), 36 pages. pdf
  • Ivan Sysoev, Anneli Hershman, Susan Fine, Claire Traweek, and Deb Roy. (2017). SpeechBlocks: A Constructionist Early Literacy App. Proceedings of the 2017 Conference on Interaction Design and Children, 10 pages. 
  • Prashanth Vijayaraghavan, Soroush Vosoughi, and Deb Roy. (2017). Twitter Demographic Classification Using Deep Multi-modal Multi-task Learning. Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, 6 pages. pdf


  • Ivan Sysoev, Anneli Hershman, Susan Fine, Deb Roy, Mina Soltangheis, and Brianne Fitzpatrick. (2016). Exploring SpeechBlocks: Piloting a Constructionist Literacy App with Preschool Children. Presented at the 2016 American Speech-Language-Hearing Association (ASHA) Conference. Philadelphia, PA. pdf 
  • Anneli Hershman, Ivan Sysoev, and Deb Roy. (2016). SpeechBlocks: Using Literacy Apps as Building Blocks to Analyze Play. Presented at the Conference for Digital Media and Learning (DML). Irvine, CA. pdf 
  • Prashanth Vijayaraghavan, Ivan Sysoev, Soroush Vosoughi and Deb Roy. (2016). DeepStance at SemEval-2016 Task 6: Detecting Stance in Tweets Using Character and Word-Level CNNs. In Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016). San Diego, California. pdf 
  • Soroush Vosoughi and Deb Roy. (2016). A Semi-automatic Method for Efficient Detection of Stories on Social Media. In Proceedings of the 10th International AAAI Conference on Weblogs and Social Media (ICWSM 2016). Cologne, Germany. pdf
  • Soroush Vosoughi*, Prashanth Vijayaraghavan* and Deb Roy. (2016). Tweet2Vec: Learning Tweet Embeddings using Character-level CNN-LSTM Encoder-Decoder. Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2016). Pisa, Italy. *Equal Contribution. pdf
  • Martin Saveski, Sophie Chou, and Deb Roy. (2016). Tracking the Yak: An Empirical Study of Yik Yak. In Proceedings of the 10th International AAAI Conference on Weblogs and Social Media (ICWSM 2016). Cologne, Germany. pdf
  • Prashanth Vijayaraghavan*, Soroush Vosoughi*, and Deb Roy. (2016). Automatic Detection and Categorization of Election-Related Tweets. In Proceedings of the 10th International AAAI Conference on Weblogs and Social Media (ICWSM 2016). Cologne, Germany. *Equal Contribution. pdf
  • David Alvarez-Melis and Martin Saveski. (2016). Topic Modeling in Twitter: Aggregating Tweets by Conversations. In Proceedings of the 10th International AAAI Conference on Weblogs and Social Media (ICWSM 2016). Cologne, Germany. pdf
  • Soroush Vosoughi and Deb Roy. (2016). Tweet Acts: A Speech Act Classifier for Twitter. In Proceedings of the 10th International AAAI Conference on Weblogs and Social Media (ICWSM 2016). Cologne, Germany. pdf
  • Soroush Vosoughi, Russell Stevens, and Deb Roy. (2016). Addressing the Demographic Bias on Twitter. The 71st Annual Conference of the American Association for Public Opinion Research (AAPOR). Austin, Texas. pdf 
  • Martin Saveski, Eric Chu, Soroush Vosoughi, and Deb Roy. (2016). Human Atlas: A Tool for Mapping Social Networks. In proceedings of the 25th International Conference on World Wide Web Companion. Montreal, Canada. pdf 


  • Soroush Vosoughi, Helen Zhou, and Deb Roy. (2015). Digital Stylometry: Linking Profiles Across Social Networks. In proceedings of the 7th International Conference on Social Informatics (SocInfo 2015). Beijing, China. pdf 
  • Soroush Vosoughi and Deb Roy. (2015). A Human-Machine Collaborative System for Identifying Rumors on Twitter. In proceedings of the IEEE ICDM 2015 workshop on Event Analytics using Social Media Data (EASM). Atlantic City, New Jersey. pdf 
  • Brandon C. Roy, Michael C. Frank, Philip DeCamp, Matthew Miller and Deb Roy. (2015). Predicting the Birth of a Spoken Word. Proceedings of the National Academy of Sciences of the United States of America (PNAS). http://www.pnas.org/content/early/2015/09/15/1419773112
  • Soroush Vosoughi, Helen Zhou, and Deb Roy. (2015). Enhanced Twitter Sentiment Classification Using Contextual Information. In proceedings of the EMNLP 2015 workshop on Computational Approaches to Subjectivity, Sentiment & Social Media Analysis (WASSA). Lisboa, Portugal. pdf
  • Pau Perng-Hwa Kung, and Deb Roy. (2015). Measuring Responsiveness in the Online Public Sphere for the 2016 US Election: Concepts. Workshop on Networks in the Social and Information Sciences NIPS. pdf 


  • Brandon C. Roy, Soroush Vosoughi, and Deb Roy. (2014). Grounding language models in spatiotemporal context. Proceedings of Interspeech 2014. Singapore. pdf 
  • Soroush Vosoughi. (2014). Improving automatic speech recognition through head pose driven visual grounding. Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems (CHI ’14). ACM, New York, NY, USA, 3235-3238. pdf 
  • Iris Chin, Soroush Vosoughi, Matthew Goodwin, Deb Roy and Letitia Naigles. (2014). How the Speechome Recorder can change our understanding of developmental trajectories. Poster presented at the 13th International Congress for the Study of Child Language. Amsterdam, The Netherlands. pdf
  • Thomas Kollar, Stefanie Tellex, Deb Roy, and Nicholas Roy. (2014). Grounding verbs of motion in natural language commands to robots. Experimental Robotics. Springer Berlin Heidelberg. pdf


  • Iris Chin, Soroush Vosoughi, Matthew Goodwin, Deb Roy and Letitia Naigles. (2013). Dense Data Collection Through the Speechome Recorder Better Reveals Developmental Trajectories. In the Extended Abstract of the International Meeting for Autism Research (IMFAR) 2013. San Sebastián, Spain. pdf
  • Iris Chin, Soroush Vosoughi, Emily Potrzeba, Matthew Goodwin, Deb Roy and Letitia Naigles. (2013). Verb use in a child previously diagnosed with ASD: Dense recordings reveal typical and atypical development. In the Extended Abstract of the Biennial Meeting of the Society for Research in Child Development (SRCD) 2013. Seattle, Washington. pdf
  • Thomas Kollar, Stefanie Tellex, Matthew R. Walter, Albert Huang, Abraham Bachrach, Sachi Hemachandra, Emma Brunskill, Ashis Banerjee, Deb Roy, Seth Teller, and Nicholas Roy. (2013). Generalized grounding graphs: A probabilistic framework for understanding grounded language. Journal of Artificial Intelligence Research (JAIR). 35 pages. pdf 


  • Soroush Vosoughi, Matthew S. Goodwin, Bill Washabaugh, and Deb Roy. (2012). A portable audio/video recorder for longitudinal study of child development. In Proceedings of the 14th ACM International Conference on Multimodal Interaction (ICMI ’12). ACM, New York, NY, USA, 193-200. pdf
  • Jeff Orkin and Deb Roy. (2012). Understanding Speech in Interactive Narratives with Crowdsourced Data. Proceedings of the Artificial Intelligence and Interactive Digital Entertainment Conference (AIIDE). pdf 
  • Soroush Vosoughi and Deb Roy. (2012). An Automatic Child-Directed Speech Detector for the Study of Child Language Development. Proceedings of Interspeech 2012. Portland, Oregon. pdf 
  • Brandon C. Roy, Michael C. Frank, and Deb Roy. (2012). Relating Activity Contexts to Early Word Learning in Dense Longitudinal Data. Proceedings of the 34th Annual Meeting of the Cognitive Science Society. Sapporo, Japan. (Correction to Figure 4, 5/18/13) pdf 
  • Soroush Vosoughi and Deb Roy. (2012). A longitudinal study of prosodic exaggeration in child-directed speech. Proceedings of the 6th International Conference on Speech Prosody. Shanghai, China. pdf 
  • Iris Chin, Devin Rubin, Andrea Tovar, Soroush Vosoughi, Michelle Cheng, Emily Potrzeba, Matthew Goodwin, Deb Roy, Letitia Naigles. (2012). Dense Recordings of Naturalistic Interactions Reveal both Typical and Atypical Speech in One Child with ASD. Proceedings of the International Meeting for Autism Research (IMFAR). Toronto, Canada. pdf


  • Hilke Reckman, Jeff Orkin and Deb Roy. (2011). Extracting aspects of determiner meaning from dialogue in a virtual world environment. Proceedings of the International Conference on Computational Semantics (IWCS). Oxford, England. pdf 
  • George Shaw, Deb Roy. (2011). An Interface for Visualization and Exploration of Spatial Distributions. Scalable Integration of Analytics and Visualization. pdf 


  • Jeff Orkin, Tynan Smith, and Deb Roy. (2010). Behavior Compilation for AI in Games. Proceedings of the Artificial Intelligence and Interactive Digital Entertainment Conference (AIIDE). pdf 
  • Stefanie Tellex, Thomas Kollar, George Shaw, Nicholas Roy and Deb Roy. (2010). Grounding Spatial Language for Video Search. Proceedings of the Twelfth International Conference on Multimodal Interfaces (ICMI). Beijing, China. pdf 
  • Philip DeCamp, George Shaw, Rony Kubat and Deb Roy. (2010). An Immersive System for Browsing and Visualizing Surveillance Video. Proceedings of ACM Multimedia 2010. Florence, Italy. Best Student Paper Award. pdf
  • Albert Huang, Stefanie Tellex, Abraham Bachrach, Thomas Kollar, Deb Roy, and Nick Roy. (2010). Natural Language Command of an Autonomous Micro-Air Vehicle. Proceedings of the International Conference on Intelligent Robots and Systems (IROS). Taipei, Taiwan. pdf
  • Meredith Meyer, Philip DeCamp, Bridgette Hard, Dare Baldwin and Deb Roy. (2010). Assessing Behavioral and Computational Approaches to Naturalistic Action Segmentation. Proceedings of the 32nd Annual Cognitive Science Conference. Portland, Oregon. pdf 
  • Brandon C. Roy*, Soroush Vosoughi*, and Deb Roy. (2010). Automatic Estimation of Transcription Accuracy and Difficulty. Proceedings of Interspeech 2010. Makuhari, Japan. pdf 
  • Hilke Reckman, Jeff Orkin and Deb Roy. (2010). Learning meanings of words and constructions, grounded in a virtual game. Proceedings of the 10th Conference on Natural Language Processing (KONVENS). Saarbrücken, Germany. pdf 
  • Soroush Vosoughi, Brandon C. Roy, Michael C. Frank, and Deb Roy. (2010). Contributions of Prosodic and Distributional Features of Caregivers’ Speech in Early Word Learning. Proceedings of the 32nd Annual Cognitive Science Conference. Portland, Oregon. pdf 
  • Jeff Orkin, Tynan Smith, Hilke Reckman, and Deb Roy. (2010). Semi-Automatic Task Recognition for Interactive Narratives with EAT & RUN. Proceedings of the 3rd Intelligent Narrative Technologies Workshop at the 5th International Conference on Foundations of Digital Games (FDG), Monterey, CA. pdf 
  • Jeff Orkin and Deb Roy. (2010). Semi-Automated Dialogue Act Classification for Situated Social Agents in Games. Proceedings of the Agents for Games & Simulations Workshop at the 9th International Conference on Autonomous Agents and Multiagent Systems (AAMAS), Toronto, Canada. pdf 
  • Soroush Vosoughi, Brandon C. Roy, Michael C. Frank, and Deb Roy. (2010). Effects of Caregiver Prosody on Child Language Acquisition. Proceedings of the 5th International Conference on Speech Prosody. Chicago, IL. pdf 
  • Thomas Kollar, Stefanie Tellex, Deb Roy, and Nick Roy. (2010). Toward Understanding Natural Language Directions. Proceedings of Human Robot Interaction Conference 2010 (HRI-2010). Osaka, Japan. pdf 
  • Philipp Robbel and Deb Roy. (2010). Exploiting feature dynamics for active object recognition. 11th International Conference on Control Automation Robotics & Vision (ICARCV), 7 pages. pdf 
  • Jeff Orkin and Deb Roy. (2010). Toward an interleaved model of actions and words in social simulation. Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems, 2 pages. pdf 


  • Stefanie Tellex and Deb Roy. (2009). Grounding Spatial Prepositions for Video Search. Proceedings of the Eleventh International Conference on Multimodal Interfaces (ICMI-2009). Cambridge, MA. pdf
  • Deb Roy. (2009). New Horizons in the Study of Child Language Acquisition. Proceedings of Interspeech 2009. Brighton, England. pdf
  • Brandon C. Roy and Deb Roy. (2009). Fast transcription of unstructured audio recordings. Proceedings of Interspeech 2009. Brighton, England. pdf
  • Rony Kubat, Daniel Mirman and Deb Roy. (2009). Semantic context effects on color categorization. Proceedings of the 31st Annual Cognitive Science Society Meeting. pdf 
  • Brandon C. Roy, Michael C. Frank and Deb Roy. (2009). Exploring word learning in a high-density longitudinal corpus. Proceedings of the 31st Annual Meeting of the Cognitive Science Society. pdf 
  • Philip DeCamp and Deb Roy. (2009). A Human-Machine Collaborative Approach to Tracking Human Movement in Multi-Camera Video. Proceedings of the 2009 International Conference on Content-based Image and Video Retrieval (CIVR). pdf 
  • Stefanie Tellex and Deb Roy. (2009). Towards Surveillance Video Search by Natural Language Query. Proceedings of the ACM International Conference on Image and Video Retrieval. pdf 
  • Jeff Orkin and Deb Roy. (2009). Automatic Learning and Generation of Social Behavior from Collective Human Gameplay. Proceedings of the 8th International Conference on Autonomous Agents and Multiagent Systems (AAMAS). pdf 
  • Kai-yuh Hsiao and Soroush Vosoughi. (2009). The Object Schema Model and Situational Context. pdf
  • David Lazer, Alex Sandy Pentland, Lada Adamic, Sinan Aral, Albert Laszlo Barabasi, Devon Brewer, Nicholas Christakis, Noshir Contractor, James Fowler, Myron Gutmann, Tony Jebara, Gary King, Michael Macy, Deb Roy, and Marshall Van Alstyne. (2009). Life in the network: the coming age of computational social science. Science (New York, NY) 323, no. 5915.


  • Deb Roy. (2008). A Mechanistic Model of Three Facets of Meaning. Symbols, Embodiment, and Meaning, de Vega, Glenberg, and Graesser, eds. pdf
  • Michael Fleischman and Deb Roy. (2008). Grounded Language Modeling for Automatic Speech Recognition of Sports Video. HLT/NAACL. Columbus, OH. pdf 
  • Kai-yuh Hsiao, Stefanie Tellex, Soroush Vosoughi, Rony Kubat, and Deb Roy. (2008). Object Schemas for Grounding Language in a Responsive Robot. Connection Science 20, 4 (Dec.2008), 253-276. pdf 
  • Kai-yuh Hsiao, Soroush Vosoughi, Stefanie Tellex, Rony Kubat, and Deb Roy. (2008). Object Schemas for Responsive Robotic Language Use. Proceedings of the 3rd ACM/IEEE International Conference on Human-Robot Interaction. pdf 
  • Angelo Cangelosi, Tony Belpaeme, Giulio Sandini, Giorgio Metta, Luciano Fadiga, Gerhard Sagerer, Katherina Rohlfing, Britta Wrede, Stefano Nolfi, Domenico Parisi, Chrystopher Nehaniv, Kerstin Dautenhahn, Joe Saunders, Kerstin Fischer, Jun Tani, and Deb Roy. (2008). The ITALK project: Integration and Transfer of Action and Language Knowledge in Robots. Proceedings of Third ACM/IEEE International Conference on Human Robot Interaction (HRI), 2 pages. pdf 


  • Jeff Orkin and Deb Roy. (2007). The Restaurant Game: Learning Social Behavior and Language from Thousands of Players Online. Journal of Game Development, 3(1), 39-60. pdf 
  • Rony Kubat, Philip DeCamp, Brandon Roy, and Deb Roy. (2007). TotalRecall: Visualization and Semi-Automatic Annotation of Very Large Audio-Visual Corpora. Ninth International Conference on Multimodal Interfaces (ICMI 2007). pdf
  • Michael Fleischman and Deb Roy. (2007). Unsupervised Content-Based Indexing of Sports Video Retrieval. 9th ACM Workshop on Multimedia Information Retrieval (MIR). Augsburg, Germany. pdf 
  • Michael Fleischman, Brandon Roy, and Deb Roy. (2007). Temporal Feature Induction for Baseball Highlight Classification. ACM Multimedia Conference. Augsburg, Germany. pdf 
  • Peter Gorniak and Deb Roy. (2007). Situated Language Understanding as Filtering Perceived Affordances. Cognitive Science, 31(2), 197-231. pdf
  • Michael Fleischman and Deb Roy. (2007). Situated Models of Meaning for Sports Video Retrieval. HLT/ACL 2007, Rochester, NY. pdf
  • Stefanie Tellex and Deb Roy. (2007). Grounding Language in Spatial Routines. AAAI 2007 Spring Symposia on Control Mechanisms for Spatial Knowledge Processing in Cognitive / Intelligent Systems, Stanford University, Palo Alto CA. pdf 
  • Michael Levit and Deb Roy. (2007). Interpretation of Spatial Language in a Map Navigation Task. IEEE Transactions on Systems, Man, and Cybernetics, Part B, 37(3), 667-679. pdf 
  • Michael Fleischman and Deb Roy. (2007). Representing Intentions in a Cognitive Model of Language Acquisition: Effects of Phrase Structure on Situated Verb Learning. AAAI Spring Symposium: Intentions in Intelligent Systems, 6 pages. pdf 


  • Michael Fleischman, Philip DeCamp, and Deb Roy. (2006). Mining Temporal Patterns of Movement for Video Content Classification. Proceedings of the 8th ACM SIGMM International Workshop on Multimedia Information Retrieval. pdf
  • Deb Roy, Rupal Patel, Philip DeCamp, Rony Kubat, Michael Fleischman, Brandon Roy, Nikolaos Mavridis, Stefanie Tellex, Alexia Salata, Jethran Guinness, Michael Levit, Peter Gorniak. (2006). The Human Speechome Project. Proceedings of the 28th Annual Cognitive Science Conference. pdf
  • Peter Gorniak and Deb Roy. (2006). Perceived Affordances as a Substrate for Linguistic Concepts. Twenty-eighth Annual Meeting of the Cognitive Science Society, 6 pages. pdf 
  • Peter Gorniak, Jeff Orkin, and Deb Roy. (2006). Speech, Space and Purpose: Situated Language Understanding in Computer Games. Twenty-eighth Annual Meeting of the Cognitive Science Society Workshop on Computer Games. pdf 
  • Nikolaos Mavridis and Deb Roy. (2006). Grounded Situation Models for Robots: Where Words and Percepts Meet. Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). pdf 
  • Dong Zhang, Daniel Gatica-Perez, Deb Roy, Samy Bengio. (2006). Modeling Interactions from Email Communication. IEEE International Conference on Multimedia & Expo (ICME). pdf 
  • Stefanie Tellex and Deb Roy. (2006). Spatial Routines for a Simulated Speech-Controlled Vehicle. Proceedings of Human Robot Interaction Conference 2006 (HRI-2006). pdf 


  • Philip DeCamp, Amber Frid-Jimenez, Jethran Guiness and Deb Roy. (2005). Gist Icons: Seeing Meaning in Large Bodies of Literature. IEEE Info Visualization 2005 Conference. pdf 
  • Kai-yuh Hsiao and Deb Roy. (2005). A Habit System for an Interactive Robot. AAAI Fall Symposium 2005: From Reactive to Anticipatory Cognitive Embodied Systems. pdf
  • Peter Gorniak and Deb Roy. (2005). Probabilistic Grounding of Situated Speech using Plan Recognition and Reference Resolution. Seventh International Conference on Multimodal Interfaces (ICMI 2005). Best Paper Award. pdf
  • Michael Fleischman and Deb Roy. (2005). Intentional Context in Situated Language Learning. Ninth Conference on Computational Natural Language Learning. pdf 
  • Nick Mavridis and Deb Roy. (2005). Grounded Situation Models for Robots: Bridging Language, Perception, and Action. AAAI-05 Workshop on Modular Construction of Human-Like Intelligence. pdf
  • Deb Roy. (2005). Semiotic Schemas: A Framework for Grounding Language in Action and Perception. Artificial Intelligence, 167(1-2):170-205. pdf
  • Michael Fleischman and Deb Roy. (2005). Why are verbs harder to learn than nouns? Initial insights from a computational model of situated word learning. 27th Annual Meeting of the Cognitive Science Society. pdf
  • Deb Roy. (2005). Grounding words in perception and action: computational insights. Trends in Cognitive Science, 9(8), 389-396. pdf
  • Kai-yuh Hsiao, Peter Gorniak, and Deb Roy. (2005). NetP: A Network API for Building Heterogeneous Modular Intelligent Systems. Proceedings of AAAI 2005 Workshop in Modular Construction of Human-Like Intelligence. pdf
  • Peter Gorniak and Deb Roy. (2005). Speaking with your Sidekick: Understanding Situated Speech in Computer Role Playing Games. Proceedings of Artificial Intelligence and Interactive Digital Entertainment, 2005. pdf 
  • Deb Roy and Niloy Mukherjee. (2005). Towards Situated Speech Understanding: Visual Context Priming of Language Models. Computer Speech and Language, 19(2), pages 227-248. pdf 
  • Deb Roy and Ehud Reiter. (2005). Connecting Language to the World. Artificial Intelligence, 167(1-2): 1-12. pdf 
  • Anna V. Fisher, Michael Fleischman, Deb Roy, and Vladimir M. Sloutsky. (2005). Effects of Category Labels on Induction and Visual Processing: Support or Interference? Twenty-seventh Annual Meeting of the Cognitive Science Society, 6 pages. pdf 
  • Deb Roy. (2005). Grounding language in the world: Schema theory meets semiotics. Special Issue of Artificial Intelligence Journal: Connecting Language to the World. 
  • Dong Zhang, Daniel Gatica-Perez, Samy Bengio, and Deb Roy. (2005). Learning Influence among Interacting Markov Chains. Neural Information Processing Systems (NIPS), 8 pages. pdf 
  • Deb Roy and Niloy Mukherjee. (2005). Visual Context Driven Semantic Priming of Speech Recognition and Understanding. Computer Speech and Language, 19(2): 227-248. pdf


  • Joshua Juster and Deb Roy. (2004). Elvis: Situated Speech and Gesture Understanding for a Robotic Chandelier. Proc. Int. Conf. Multimodal Interfaces. pdf 
  • Deb Roy, Yair Ghitza, Jeff Bartelma, and Charlie Kehoe. (2004). Visual Memory Augmentation: Using Eye Gaze as an Attention Filter. Proceedings of the IEEE International Symposium on Wearable Computers. pdf 
  • Deb Roy, Kai-Yuh Hsiao, and Nikolaos Mavridis. (2004). Mental Imagery for a Conversational Robot. IEEE Transactions on Systems, Man, and Cybernetics, Part B, 34(3), 1374-1383. pdf
  • Peter Gorniak and Deb Roy. (2004). Grounded Semantic Composition for Visual Scenes. Journal of Artificial Intelligence Research, Volume 21, pages 429-470. pdf 
  • Deb Roy. (2004). 10x: Human-machine Symbiosis. BT Technology Journal, 22(4): 121-124. pdf 
  • Rosalind W Picard, Seymour Papert, Walter Bender, Bruce Blumberg, Cynthia Breazeal, David Cavallo, Tod Machover, Mitchel Resnick, Deb Roy, Carol Strohecker (2004). Affective learning—a manifesto. BT Technology Journal, 22(4): 253-269. pdf 
  • Peter Gorniak, Deb Roy. (2004). BISHOP|BLENDER: Spatially Grounded Language Understanding in 3D Modelling Software. In Proceedings of the NAACL, 2 pages. pdf 
  • Rupal Patel, Sam Pilato, and Deb Roy. (2004). Beyond Linear Syntax: An Image-Oriented Communication Aid. Journal of Assistive Technology Outcomes and Benefits, 1(1): 57-66 pdf 
  • Jeff Orkin, Deb Roy. (2004). Capturing and generating social behavior with the restaurant game. Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems. pdf 


  • Niloy Mukherjee and Deb Roy. (2003). A Visual Context-Aware Multimodal System for Spoken Language Processing. Proc. Eurospeech, 6 pages. pdf
  • Peter Gorniak and Deb Roy. (2003). Augmenting User Interfaces with Adaptive Speech Commands. In Proceedings of the International Conference for Multimodal Interfaces. pdf 
  • Peter Gorniak and Deb Roy. (2003). A Visually Grounded Natural Language Interface for Reference to Spatial Scenes. In Proceedings of the International Conference for Multimodal Interfaces. pdf 
  • Brian Whitman, Deb Roy, Barry Vercoe. (2003). Learning Word Meanings and Descriptive Parameter Spaces from Music. In Proceedings of the HLT-NAACL03 workshop on Learning Word Meaning from Non-Linguistic Data. pdf 
  • Kai-yuh Hsiao, Nikolaos Mavridis, Deb Roy. (2003). Coupling Perception and Simulation: Steps Towards Conversational Robotics. The Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems. pdf
  • Deb Roy, Kai-yuh Hsiao, Nikolaos Mavridis. (2003). Conversational Robots: Building Blocks for Grounding Word Meanings. In Proceedings of the HLT-NAACL03 Workshop on Learning Word Meaning from Non-Linguistic Data. pdf 
  • Deb Roy. (2003). Grounded Spoken Language Acquisition: Experiments in Word Learning. IEEE Transactions on Multimedia, 5(2): 197-209. pdf
  • Deb Roy, Kai-Yuh Hsiao, Nikolaos Mavridis, and Peter Gorniak. (2003). Ripley, Hand Me The Cup! (Sensorimotor representations for grounding word meaning). International Conference of Automatic Speech Recognition and Understanding, 6 pages. pdf
  • Peter Gorniak and Deb Roy. (2003). Understanding Complex Visually Referring Utterances. NAACL Workshop on Word Meaning. pdf


  • Deb Roy, Peter Gorniak, Niloy Mukherjee, and Josh Juster. (2002). A Trainable Spoken Language Understanding System for Visual Object Selection. In Proceedings of the International Conference of Spoken Language Processing. pdf 
  • Deb Roy. (2002). A Trainable Visually-Grounded Spoken Language Generation System. In Proceedings of the International Conference of Spoken Language Processing. pdf 
  • Deb Roy. (2002). Learning Words and Syntax for a Visual Description Task. Computer Speech and Language. pdf 
  • Deb Roy, Kai-Yuh Hsiao, Peter Gorniak, and Niloy Mukherjee. (2002). Grounding natural spoken language semantics in visual perception and motor control. AAAI Technical Report FS-02-03. pdf 
  • Deb Roy and Alex Pentland. (2002). Learning Words from Sights and Sounds: A Computational Model. Cognitive Science, 26(1), 113-146. pdf
  • Ewa Dominowska, Deb Roy and Rupal Patel. (2002). An Adaptive Context-Sensitive Communication Aid. Proceedings for the 17th Annual International Conference “Technology and Persons with Disabilities”. pdf
  • Deb Roy. (2002). A System that Learns to Describe Objects in Visual Scenes. Proc. Seventh International Conference on Spoken Language Processing, 4 pages. pdf 
  • Deb Roy. (2002). Towards visually-grounded spoken language acquisition. Proc. Fourth International Conference on Multimodal Interfaces, 6 pages. pdf 


  • Deb Roy. (2001). Situation-Aware Spoken Language Processing. Royal Institute of Acoustics Workshop on Innovation in Speech Processing, Stratford-upon-Avon, England. pdf 
  • Fred Cummins and Deb Roy. (2001). Using Synchronous Speech to Minimize Variability. Royal Institute of Acoustics Workshop on Innovation in Speech Processing, Stratford-upon-Avon, England. pdf 


  • Deb Roy. (2000). Integration of Speech and Vision using Mutual Information. Int. Conf. Acoustics, Speech and Signal Processing. pdf
  • Deb Roy. (2000). Grounded Speech Communication. Proceedings of the International Conference on Spoken Language Processing, 5 pages. pdf
  • Deb Roy. (2000). A computational model of word learning from multimodal sensory input. Proceedings of the International Conference of Cognitive Modeling, 8 pages. pdf 


  • Deb Roy, Bernt Schiele, and Alex Pentland. (1999). Learning Audio-Visual Associations using Mutual Information. International Conference on Computer Vision, Workshop on Integrating Speech and Image Understanding. pdf 
  • Alex Pentland, Deb Roy, and Christopher Richard Wren. (1999). Perceptual intelligence: learning gestures and words for individualized, adaptive interfaces. Proceedings of the 8th International Conference on Human-Computer Interaction, 5 pages. 


  • Deb Roy and Alex Pentland. (1998). A Phoneme Probability Display for Individuals with Hearing Disabilities. Proceedings of the ACM Conference On Assistive Technologies, 4 pages. pdf 
  • Deb Roy and Alex Pentland. (1998). Learning words from natural audio-visual input. Fifth International Conference on Spoken Language Processing (ICSLP), 5 pages. pdf 
  • Rupal Patel and Deb Roy. (1998). Teachable interfaces for individuals with dysarthric speech and severe physical disabilities. Proceedings of the AAAI Workshop on Integrating Artificial Intelligence and Assistive Technology, 8 pages. pdf
  • Deb Roy and Alex Pentland. (1998). Word Learning in a Multimodal Environment, Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 4 pages. pdf 


  • Deb Roy and Carl Malamud. (1997). Integration of a large text and audio corpus using speaker identification. Proceedings of the AAAI Spring Symposium on the Intelligent Integration and Use of Text, Image, Video and Audio Corpora. 
  • Deb Roy and Alex Pentland. (1997). Multimodal adaptive interfaces. MIT Media Laboratory Perceptual Computing Technical Report No. 438. pdf
  • Deb Roy and Carl Malamud. (1997). Speaker identification based text to audio alignment for an audio retrieval system. Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, Munich, Vol. 2, pp. 1099-1103. pdf
  • Deb Roy. (1997). Speaker indexing using neural network clustering of vowel spectra. International Journal of Speech Technology, 1(2): 143-149. pdf
  • Deb Roy, Nitin Sawhney, Chris Schmandt, and Alex Pentland. (1997). Wearable audio computing: A survey of interaction techniques. Perceptual Computing Technical Report No. 434, 9 pages. pdf


  • Chris Schmandt and Deb Roy. (1996). Using acoustic structure in a hand-held audio playback device. IBM Systems Journal, 35(3-4). pdf
  • Deb Roy and Alex Pentland. (1996). Automatic spoken affect classification and analysis. Proceedings of the Second International Conference on Automatic Face and Gesture Recognition, 6 pages. pdf
  • Deb Roy and Chris Schmandt. (1996). NewsComm: A Hand-Held Interface for Interactive Access to Structured Audio. Proceedings of the ACM Conference on Computer Human Interaction, 9 pages. pdf


  • B. N. Chodirker, D. Roy, C. R. Greenberg, M. Cheang, J. A. Evans, & M. H. Reed. (1991). Computer assisted analysis of hand radiographs in infantile hypophosphatasia carriers. Pediatric Radiology, 21(3): 216-219. pdf



  • Soroush Vosoughi. (2015). Automatic Detection and Verification of Rumors on Twitter. Ph.D. Thesis, Massachusetts Institute of Technology. pdf
  • Brandon C. Roy. (2013). The Birth of a Word. Ph.D. Thesis, Massachusetts Institute of Technology. pdf 
  • Jeff Orkin. (2013). Collective Artificial Intelligence: Simulated Role-Playing from Crowdsourced Data. Ph.D. Thesis, Massachusetts Institute of Technology. pdf 
  • Rony Kubat. (2012). Will They Buy? Ph.D. Thesis, Massachusetts Institute of Technology. pdf 
  • Philip DeCamp. (2012). Data Visualization in the First Person. Ph.D. Thesis, Massachusetts Institute of Technology. pdf with videos pdf
  • Stefanie Tellex. (2010). Natural Language and Spatial Reasoning. Ph.D. Thesis, Massachusetts Institute of Technology. pdf 
  • Michael Ben Fleischman. (2008). Grounding Language in Events. Ph.D. Thesis, Massachusetts Institute of Technology. pdf 
  • Kai-yuh Hsiao. (2007). Embodied Object Schemas for Grounding Language Use. Ph.D. Thesis, Massachusetts Institute of Technology. pdf
  • Peter Gorniak. (2005). The Affordance-Based Concept. Ph.D. Thesis, Massachusetts Institute of Technology. pdf 
  • Deb Roy. (1999). Learning from Sights and Sounds: A Computational Model. Ph.D. in Media Arts and Sciences, MIT. pdf 

M.Sc. / M.Eng.

  • Pau Perng-Hwa Kung. (2016). Detecting and Analyzing Bursty Events on Twitter. M.Sc. Thesis, Massachusetts Institute of Technology. pdf
  • Sophie Chou. (2016). Reading Between the (Party) Lines: How Political News is Seen and Shared. M.Sc. Thesis, Massachusetts Institute of Technology. pdf 
  • Prashanth Vijayaraghavan. (2016). Automatic Identification of Representative Content on Twitter. M.Sc. Thesis, Massachusetts Institute of Technology. pdf 
  • Matthew Miller. (2011). Semantic Spaces: Behavior, Language and Word Learning in the Human Speechome Corpus. M.Sc. in Media Arts and Sciences Thesis. pdf 
  • George Macaulay Shaw. (2011). A Taxonomy of Situated Language in Natural Contexts. M.Sc. in Media Arts and Sciences Thesis. pdf
  • Kleovoulos Tsourides. (2010). Visually Grounded Virtual Accelerometers: A Longitudinal Video Investigation of Dyadic Bodily Dynamics around the time of Word Acquisition. M.Sc. in Media Arts and Sciences Thesis. pdf 
  • Soroush Vosoughi. (2010). Interactions of caregiver speech and early word learning in the Speechome Corpus: Computational Explorations. M.Sc. Thesis, Massachusetts Institute of Technology. pdf 
  • Sophia Yuditskaya. (2010). Automatic Vocal Recognition of a Child’s Perceived Emotional State within the Speechome Corpus. M.Sc. in Media Arts and Sciences Thesis. pdf 
  • Sheng-Ying Pao. (2010). Connected Strangers: Manipulating Social Perceptions to Study Trust. M.Sc. in Media Arts and Sciences Thesis. pdf
  • Philip DeCamp. (2007). HeadLock: Wide-Range Head Pose Estimation for Low Resolution Video. M.Sc. in Media Arts and Sciences Thesis. pdf
  • Jeff Orkin. (2007). Learning Plan Networks in Conversational Video Games. M.Sc. in Media Arts and Sciences Thesis. pdf 
  • Philipp Robbel. (2007). Exploiting Object Dynamics for Recognition and Control. M.Sc. in Media Arts and Sciences Thesis. pdf 
  • Brandon Roy. (2007). Human-Machine Collaboration for Rapid Speech Transcription. M.Sc. in Media Arts and Sciences Thesis. pdf
  • Andre Ribeiro. (2005). Graph Dynamics: Learning and Representation. M.Sc. in Media Arts and Sciences Thesis. pdf
  • Charles Kehoe. (2005). Indexical Grounding for a Mobile Robot. M.Eng. EECS Thesis. pdf 
  • Niloy Mukherjee. (2003). Spontaneous Speech Recognition Using Visual Context-Aware Language Models. M.Sc. in Media Arts and Sciences Thesis. pdf
  • Sheel Sanjay Dhande. (2003). A Computational Model to Connect Gestalt Perception and Natural Language. M.Sc. in Media Arts and Sciences Thesis. pdf 
  • Ewa Dominowska. (2002). A Communication Aid with Context-Aware Vocabulary Prediction. M.Eng. EECS Thesis. pdf
  • Norimasa Yoshida. (2002). Automatic Utterance Segmentation in Spontaneous Speech. M.Eng. EECS Thesis. pdf 
  • Ben Yoder. (2001). Spontaneous Speech Recognition Using Hidden Markov Models. M.Eng. EECS Thesis. pdf 
  • Deb Roy. (1995). NewsComm: A Hand-Held Device for Interactive Access to Structured Audio. M.Sc. in Media Arts and Sciences, MIT. pdf

Non-Academic Writing



  • Naigles, L. R., Chin, I., Vosoughi, S., Goodwin, M., & Roy, D. (2011). Final Report for 3R01DC007428 – 04S1 Collaborative research supplement and infrastructure and research equipment for advancement of science. pdf