
Publication

Identifying Vocal and Facial Biomarkers of Depression in Large-Scale Remote Recordings: A Multimodal Study Using Mixed-Effects Modeling

Hidalgo Julia, N.*, Lewis, R., Ferguson, C.*, Goldberg, S., Lau, W., Swords, C., Valdivia, G., Wilson-Mendenhall, C., Tartar, R., Picard, R., Davidson, R. (2025) Identifying Vocal and Facial Biomarkers of Depression in Large-Scale Remote Recordings: A Multimodal Study Using Mixed-Effects Modeling. Proc. Interspeech 2025, 5263-5267, doi: 10.21437/Interspeech.2025-2560

Abstract

We examine vocal and facial data from a new study with n=954 depressed participants, each characterized by six time points of the eight-item Patient Health Questionnaire survey (PHQ-8). Patients interacted with a smartphone app over four weeks, with a 3-month follow-up. The app's animated character asked participants to describe, for 90 seconds, an emotional experience from the past 24 hours. We obtained 4,875 audio-video recordings, and applied linear mixed-effects models to examine associations between depression severity and 30 acoustic, linguistic, and facial action unit features. Significant associations were found with speech timing and prosody, voice quality, linguistic sentiment, the use of self-referential pronouns, and facial action units related to smiling. We also show that these features allow accurate estimation of depression severity in multimodal mixed-effects machine learning models.
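The linear mixed-effects approach described above can be sketched in a minimal form: a model with a fixed effect for one behavioral feature and a random intercept per participant, which accounts for repeated measurements of the same person across time points. The data, feature name, and effect size below are simulated for illustration only; the study's actual 30 features and model specification are not reproduced here.

```python
# Hypothetical sketch of a linear mixed-effects analysis: one standardized
# vocal feature predicting PHQ-8 scores, with a random intercept per
# participant to handle repeated measures. All data here are simulated.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n_participants, n_visits = 200, 6  # smaller than the study's n=954, six time points

pid = np.repeat(np.arange(n_participants), n_visits)
# Participant-level random intercepts (between-person variability)
u = rng.normal(0.0, 2.0, n_participants)[pid]
# A standardized behavioral feature (e.g., a speech-timing measure)
feature = rng.normal(0.0, 1.0, n_participants * n_visits)
# Simulated PHQ-8: fixed effect of the feature + random intercept + noise
phq8 = 10.0 - 1.5 * feature + u + rng.normal(0.0, 1.0, len(pid))

df = pd.DataFrame({"pid": pid, "feature": feature, "phq8": phq8})
# Random-intercept model: phq8 ~ feature, grouped by participant
model = smf.mixedlm("phq8 ~ feature", df, groups=df["pid"])
result = model.fit()
slope = result.params["feature"]  # estimated fixed effect, near the simulated -1.5
print(f"feature slope: {slope:.3f}, p = {result.pvalues['feature']:.2e}")
```

In practice such a model would be fit once per feature (with appropriate multiple-comparison correction) to test which behavioral measures track depression severity, which mirrors the per-feature association testing the abstract describes.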
