Publication

Automatic Triage and Analysis of Online Suicide Risk with Document Embeddings and Latent Dirichlet Allocation

Jones, N., Jaques, N., Pataranutaporn, P., Ghandeharioun, A., & Picard, R. (2019, September). Analysis of Online Suicide Risk with Document Embeddings and Latent Dirichlet Allocation. In 2019 8th International Conference on Affective Computing and Intelligent Interaction Workshops and Demos (ACIIW) (pp. 1-5). IEEE.

Abstract

Machine learning is applied to a dataset of the suicidality of Reddit users in which the suicide risk labels were derived from the knowledge of expert clinicians. We present the results of machine learning models based on transfer learning from document embeddings trained on large external corpora, and find that they achieve very high F1 scores (0.83 to 0.92) in distinguishing which users are most at risk of committing suicide. Thus, these models could potentially provide valuable aid in triaging care for individuals most in danger. We compare the document embedding approach with one that incorporates expert domain knowledge. Word importance is assessed as a way of suggesting signs that could indicate suicide risk in online posts. Finally, we learn a Latent Dirichlet Allocation (LDA) topic model and find that suicidal users post to the rest of Reddit about different topics than non-suicidal users do.
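For readers unfamiliar with the metric reported above, the following is a minimal sketch of how an F1 score is computed for a binary risk label. It is an illustration only, not the authors' evaluation code: the function name `f1_score`, the label encoding (1 = high risk, 0 = lower risk), and the toy labels are all assumptions made for this example.

```python
def f1_score(y_true, y_pred, positive=1):
    """Return the F1 score (harmonic mean of precision and recall)
    for the class given by `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels, invented for illustration (not from the dataset).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
score = f1_score(y_true, y_pred)
```

Because F1 combines precision and recall, a high score requires a model to both find most high-risk users and avoid flagging many lower-risk ones.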
