This study showed the technical feasibility of an automated system for robot perception of the affective states and engagement of children undergoing autism therapy. We created a method that personalizes the "deep" learning of these affective states and engagement; deep learning is a class of machine learning algorithms that mimics the layered activity of neurons in the neocortex of the human brain. Using this method, we achieved ~60% agreement (measured by the intra-class correlation, a commonly used agreement score) between the robot-perceived levels of the children's affect and engagement and the ratings provided by human experts. While people may be inconsistent in tracking and interpreting small behavioral changes, robots can excel at this and help therapists see when even slight progress is made.
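To make the agreement score concrete: the intra-class correlation (ICC) compares the variance between rated targets with the rating error. The sketch below computes ICC(3,1), one common two-way variant; the exact variant and data used in the study may differ, so treat this purely as an illustration of the metric.

```python
import numpy as np

def icc_3_1(ratings):
    """ICC(3,1): agreement among a fixed set of raters.

    ratings: (n_targets, k_raters) matrix of continuous scores,
    e.g. valence ratings of n video segments by k raters.
    """
    ratings = np.asarray(ratings, dtype=float)
    n, k = ratings.shape
    grand = ratings.mean()
    # standard two-way ANOVA sums of squares
    ss_rows = k * ((ratings.mean(axis=1) - grand) ** 2).sum()  # between targets
    ss_cols = n * ((ratings.mean(axis=0) - grand) ** 2).sum()  # between raters
    ss_total = ((ratings - grand) ** 2).sum()
    ss_err = ss_total - ss_rows - ss_cols                      # residual
    msr = ss_rows / (n - 1)                                    # target mean square
    mse = ss_err / ((n - 1) * (k - 1))                         # error mean square
    return (msr - mse) / (msr + (k - 1) * mse)
```

Perfect rater agreement yields an ICC of 1; an ICC around 0.6, as reported here for robot-vs-expert ratings, indicates moderate-to-good agreement.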
We focused on the valence (a pleasure–displeasure continuum) and arousal (alertness) dimensions of human affect, along with task engagement, as these are important for achieving more naturalistic child-robot interactions. Typically, human experts (e.g., therapists) watch audio-visual recordings of the therapy sessions and provide continuous scores (from -1 to 1) for the valence, arousal, and engagement levels of the child in the video. These ratings are then used to train our models for robot perception of affect and engagement.
The goal of this work was to design personalized machine learning algorithms that could enable robot perception of human behavioral cues in real-world settings. We showed that this is technically feasible, and look forward to continuing to improve upon the results.
Deep learning is a family of computational models whose origins date back to the 1980s but that only recently became practical, due to earlier hardware and other technical limitations. Deep learning has "dramatically improved the state-of-the-art in speech recognition, visual object recognition, object detection and many other domains such as drug discovery and genomics" (LeCun, Bengio, and Hinton, "Deep Learning," Nature, 2015), but it has not been extensively studied in the domain of human affect and robotics.
The main technological novelty of our work is the successful personalization of deep learning models to enable automatic perception of children’s affective states and engagement during robot-assisted autism therapy. In this work, we focused on the robot perception phase, with the goal of bringing the robot perception of affect and engagement closer to human perception of these states. What is unique to the algorithms that we designed is that they are “personalized,” which means that they can easily learn and adapt to different individuals—an especially important feature when developing assistive technology for individuals with autism.
Personalized machine learning refers to computer algorithms designed to automatically learn patterns from human data and use this knowledge to make inferences from new data of target individuals. We showed the potential of this framework using deep learning algorithms, as they allow us to account for hierarchical structure in the data (for example, users' demographics) and to embed expert knowledge into the learning (for instance, the results of various cognitive tests). In this study, we used the latest advances in deep learning to disentangle different sources of variance in the behavioral data (body posture, facial expressions, tone of voice, and autonomic physiology) of children with autism, improving the robot's perception of their affective states and engagement during interactions with the robot and therapist. By leveraging the individual characteristics of each child (gender, culture, and behavioral assessment scores provided by experts), these learning algorithms can adapt to the very diverse behaviors and expressions pertinent to children with autism.
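One simple way to condition a model on both multimodal behavior and a child's profile is to feed both into the same network. The toy numpy sketch below is not the architecture from the study; the feature dimensions and names are illustrative assumptions, meant only to show how per-child context can enter the prediction of valence, arousal, and engagement.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: 16 behavioral features (pose, face, audio,
# physiology) and 4 profile features (e.g., demographics, expert-provided
# assessment scores). Weights are randomly initialized for illustration.
W1 = rng.normal(scale=0.1, size=(8, 16 + 4))  # shared hidden layer
W2 = rng.normal(scale=0.1, size=(3, 8))       # valence, arousal, engagement

def perceive(behavior, profile):
    # Condition the prediction on the child's profile by concatenating it
    # with the multimodal behavioral features before the shared layers.
    x = np.concatenate([behavior, profile])
    h = np.tanh(W1 @ x)
    return np.tanh(W2 @ h)  # each output bounded in (-1, 1), like the ratings

scores = perceive(rng.normal(size=16), rng.normal(size=4))
```

Because the same behavioral input paired with a different profile yields a different prediction, the model can express child-specific mappings from behavior to affect and engagement.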
The main bottleneck of traditional (generic) machine learning algorithms is that they are optimized at the population level and thus do not offer optimal performance for each individual in the target group. By contrast, the goal of personalized machine learning is to optimize outcomes (e.g., the accuracy of mood prediction) for a specific person rather than the average group outcome. We achieved this by using personal information (demographics, preferences, etc.) and expert knowledge, in addition to behavioral data captured by various sensors (microphones, cameras, and wearable devices), to augment existing learning algorithms and to design new "personalized" ones.
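The population-vs-individual trade-off can be demonstrated with a deliberately simple experiment: train one linear model on data pooled from two synthetic "children" whose behavior maps to engagement differently, then adapt a copy of it to one child's data alone. This is a minimal sketch of the general idea, not the study's actual training procedure.

```python
import numpy as np

rng = np.random.default_rng(42)

def fit(X, y, w=None, lr=0.1, steps=500):
    # Plain gradient descent on mean squared error; starting from an
    # existing weight vector corresponds to per-person fine-tuning.
    w = np.zeros(X.shape[1]) if w is None else w.copy()
    for _ in range(steps):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

def mse(X, y, w):
    return float(np.mean((X @ w - y) ** 2))

# Two synthetic "children" whose behavioral features relate to
# engagement through different (hypothetical) weightings.
X_a = rng.normal(size=(200, 5)); y_a = X_a @ np.array([1.0, 0.0, 0.5, 0.0, 0.0])
X_b = rng.normal(size=(200, 5)); y_b = X_b @ np.array([0.0, 1.0, 0.0, 0.5, 0.0])

# Generic model: one set of weights optimized over the pooled population.
w_generic = fit(np.vstack([X_a, X_b]), np.concatenate([y_a, y_b]))
# Personalized model: start from the generic weights, adapt to child A.
w_personal = fit(X_a, y_a, w=w_generic)
```

On child A's data, the personalized weights achieve a much lower error than the generic ones, which settle on a compromise between the two children.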
As the famous adage goes: "If you have met one person with autism, you have met one person with autism." Children with autism express their behaviors, and respond to social cues, in very diverse and atypical ways. This makes it particularly challenging for traditional machine learning to generalize across different children with autism. Using our personalized machine learning, we were able to enable robot perception of affect and engagement for the children in the study, in the real-world environment where they interact with therapists.
This new technology can also support therapists and clinicians by helping them unobtrusively monitor the behavioral patterns of children with autism and track how these change over time. This, in turn, can help in designing more naturalistic robot-assisted autism therapies, allowing therapists to tailor the therapy content to the specific needs of each child.
In a number of recently published papers, we have also shown the benefits of personalized machine learning in applications ranging from mood monitoring and pain detection to modeling the progression of Alzheimer's disease (AD). For instance, we showed that personalized machine learning can accurately predict future changes (at 6, 12, 18, and 24 months) in the key biomarkers of AD.
In this study, we used the commercially available robot NAO (SoftBank Robotics). Previous studies on human-robot interaction have found that children with autism engage easily with this type of humanoid robot. NAO also lets users easily program its emotion expressions, which are conveyed via its limb motions, tone of voice, and eye colors (e.g., yellow for "happy" and red for "angry"); this was important for the therapy tasks in our study. Other robot platforms are also feasible; the choice depends on the specific aims of the therapy and on whether the robot is to be deployed as an interactive tool, a monitoring tool, or both. Nevertheless, the personalized robot perception design that we created is not robot-dependent and can be deployed on any robot platform equipped with video, audio, and wearable sensors.
There are many robots on the market that claim to support autism therapy. Our work is directed towards equipping these and other robots with social intelligence, so that therapists, children with autism, and their parents can interact with this technology in a more engaging and productive way. While we foresee this technology reaching the market in the near future, our work still requires further collaboration with clinicians and therapists before it can be deployed in everyday scenarios.
We are currently planning to expand our study and evaluate the use of the developed technology in large-scale clinical trials. Once we have all the details ready, we will put out a call for volunteers so that parents who would like their children to participate can join the trial.
The next big step is deploying the developed technology in clinical and therapy centers and hospitals, to make it more accessible to experts working with children with autism on an everyday basis. This would also facilitate and improve the learning of our algorithms for robot perception, allowing us to achieve even better personalization of this technology for many children on the autism spectrum. To this end, we are planning to collaborate with hospitals and robot companies to help realize these goals and scale up faster, so that this technology can be used as part of daily therapy in the near future.