Prediction and Analysis of Degree of Suicidal Ideation in Online Content

Jones, Noah. Prediction and Analysis of Degree of Suicidal Ideation in Online Content. 2020. MIT, SM Thesis.


Machine learning (ML) has increasingly been used to address the growing burden of mental illness and the lack of access to quality mental health care. Recently, such models have been applied to online data, such as social media posts, to augment mental health screening. Despite the potential of these methods, ML classifiers applied to online data still perform poorly in multi-class settings. In this thesis, we propose the use of novel document embeddings and mental-health-based user embeddings for triaged suicide risk screening. We apply machine learning to infer suicide risk and urgency on a dataset of Reddit users whose risk and urgency labels were derived from crowdsourced consensus. We show that the document embedding approach outperforms count-based baselines and a method based on word importance, in which important words were identified by domain experts. We examine interpretable features and methods that help to discern and explain risk labels. Finally, using a Latent Dirichlet Allocation (LDA) topic model, we find that users labeled at risk for suicide post to the rest of Reddit about different topics than non-suicidal users do.
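To make the term "count-based baseline" concrete, the following is a minimal illustrative sketch of a bag-of-words, nearest-centroid classifier in a multi-class setting. It is not the thesis's implementation: the function names, the whitespace tokenizer, the class labels (`class_a`, `class_b`, `class_c`), and the placeholder documents are all assumptions introduced here for illustration only.

```python
from collections import Counter
import math


def bag_of_words(text):
    """Count-based representation: lowercase, split on whitespace, count tokens."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def fit_centroids(docs, labels):
    """Sum the token counts of all documents in each class.

    Cosine similarity ignores vector length, so the summed counts point in
    the same direction as the class mean.
    """
    centroids = {}
    for text, label in zip(docs, labels):
        centroids.setdefault(label, Counter()).update(bag_of_words(text))
    return centroids


def predict(centroids, text):
    """Return the label whose centroid is most similar to the document."""
    vec = bag_of_words(text)
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))


# Hypothetical placeholder data: three classes, neutral placeholder tokens.
docs = ["alpha beta beta", "gamma delta", "epsilon zeta zeta"]
labels = ["class_a", "class_b", "class_c"]

centroids = fit_centroids(docs, labels)
prediction = predict(centroids, "beta alpha")
```

A baseline of this kind sees only token counts, so two documents with similar meaning but no shared vocabulary have zero similarity. Learned document embeddings are generally intended to reduce that limitation.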
