Personalized Modeling of Real-World Vocalizations from Nonverbal Individuals

Narain, J.*, Johnson, K.T.*, Ferguson, C., O’Brien, A., Talkar, T., Zhang, Y., Wofford, P., Quatieri, T., Picard, R.W.,Maes, P., "Personalized Modeling of Real-World Vocalizations from Nonverbal Individuals," Proceedings of the International Conference on Multimodal Interaction (ICMI), Utrecht, Netherlands, October 2020. (*Co-first authors/Equal contribution)


Nonverbal vocalizations contain important affective and communicative information, especially for those who do not use traditional speech, including individuals who have autism and are non- or minimally verbal (nv/mv). Although these vocalizations are often understood by those who know them well, they can be challenging to understand for the community-at-large. This work presents (1) a methodology for collecting spontaneous vocalizations from nv/mv individuals in natural environments, with no researcher present, and personalized in-the-moment labels from a family member; (2) speaker-dependent classification of these real-world sounds for three nv/mv individuals; and (3) an interactive application to translate the nonverbal vocalizations in real time. Using support-vector machine and random forest models, we achieved speaker-dependent unweighted average recalls (UARs) of 0.75, 0.53, and 0.79 for the three individuals, respectively, with each model discriminating between 5 nonverbal vocalization classes. We also present first results for real-time binary classification of positive- and negative-affect nonverbal vocalizations, trained using a commercial wearable microphone and tested in real time using a smartphone. This work informs personalized machine learning methods for non-traditional communicators and advances real-world interactive augmentative technology for an underserved population.

Related Content