Publication

Automatic Triage and Analysis of Online Suicide Risk with Document Embeddings and Latent Dirichlet Allocation

Sept. 7, 2019

Jones, N., Jaques, N., Pataranutaporn, P., Ghandeharioun, A., & Picard, R. (2019, September). Analysis of Online Suicide Risk with Document Embeddings and Latent Dirichlet Allocation. In 2019 8th International Conference on Affective Computing and Intelligent Interaction Workshops and Demos (ACIIW) (pp. 1-5). IEEE.

Abstract

Machine learning to infer suicide risk and urgency is applied to a dataset of Reddit users in which the risk and urgency labels were derived from crowdsourced consensus. We present the results of machine learning models based on transfer learning from document embeddings trained on large external corpora, and find that they have very high F1 scores (0.83-0.92) in distinguishing which users are labeled as being most at risk of committing suicide. We further show that the document embedding approach outperforms a method based on word importance, where important words were identified by domain experts. Finally, using a Latent Dirichlet Allocation (LDA) topic model, we find that users labeled as at risk for suicide post about different topics on the rest of Reddit than non-suicidal users do.
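
For readers unfamiliar with the metric, the F1 score reported above is the harmonic mean of precision and recall for the positive (here, at-risk) class. The following is a minimal sketch of that calculation in plain Python. The function name f1_score, the positive parameter, and the toy labels are assumptions made for illustration; they are not the authors' code or data.

```python
def f1_score(y_true, y_pred, positive=1):
    """Return the F1 score for the class given by `positive`."""
    pairs = list(zip(y_true, y_pred))
    # Count true positives, false positives, and false negatives.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No correct positive predictions: define F1 as 0 to avoid dividing by zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Toy labels invented for illustration (1 = labeled at-risk, 0 = not).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
score = f1_score(y_true, y_pred)
```

In this toy case there are three true positives, one false positive, and one false negative, so precision and recall are both 0.75 and so is the F1 score.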