Abstract

Detection of suicide risk is a highly prioritized, yet complicated task. Five decades of research have produced predictions slightly better than chance (AUCs = 0.56–0.58). In this study, Artificial Neural Network (ANN) models were constructed to predict suicide risk from the everyday language of social media users. The dataset included 83,292 postings authored by 1002 authenticated Facebook users, alongside valid psychosocial information about the users. Using Deep Contextualized Word Embeddings for text representation, two models were constructed: a Single Task Model (STM), which predicts suicide risk directly from Facebook postings (Facebook texts → suicide), and a Multi-Task Model (MTM), which included hierarchical, multilayered sets of theory-driven risk factors (Facebook texts → personality traits → psychosocial risks → psychiatric disorders → suicide). Compared with the STM predictions (0.621 ≤ AUC ≤ 0.629), the MTM produced significantly improved prediction accuracy (0.697 ≤ AUC ≤ 0.746), with substantially larger effect sizes (0.729 ≤ d ≤ 0.936). Subsequent content analyses suggested that predictions did not rely on explicit suicide-related themes, but on a range of text features. The findings suggest that machine-learning-based analyses of everyday social media activity can improve suicide risk predictions and contribute to the development of practical detection tools.

Highlights

  • Detection of suicide risk is a highly prioritized, yet complicated task

  • We report on research showing that the combination of psychological knowledge, advanced machine learning techniques, and natural language processing (NLP) methods can considerably improve suicide risk predictions

  • The average prediction performance of the Single Task Model (STM), measured as the Area Under the ROC Curve (AUC), was significantly higher than chance level (AUC of 0.5), both for general risk [AUC = 0.621, 95% CI: 0.576, 0.657] and for high suicide risk [AUC = 0.629, 95% CI: 0.606, 0.660]
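
The AUC values quoted above can be read as the probability that a randomly chosen at-risk user receives a higher model score than a randomly chosen not-at-risk user, so 0.5 corresponds to chance and 1.0 to perfect ranking. The following is a minimal sketch of that pairwise definition, not the authors' evaluation code; the function name `roc_auc` and the example labels and scores are hypothetical and chosen only for illustration.

```python
def roc_auc(labels, scores):
    """Return the AUC as the fraction of (positive, negative) pairs
    in which the positive example has the higher score.
    Tied pairs count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical data: 1 = at risk, 0 = not at risk; scores are model outputs.
example_labels = [1, 1, 0, 0, 1, 0]
example_scores = [0.9, 0.4, 0.35, 0.8, 0.7, 0.1]
example_auc = roc_auc(example_labels, example_scores)  # 7 of 9 pairs ranked correctly
```

A model that assigns every user the same score wins half of each tied pair and therefore lands at exactly 0.5, which is why that value serves as the chance baseline in the comparisons above.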



Introduction

Detection of suicide risk is a highly prioritized, yet complicated task. Five decades of research have produced predictions slightly better than chance (AUCs = 0.56–0.58). A recent study, for example, managed to develop a highly accurate suicide prediction model (0.769 ≤ AUC ≤ 0.792), based on the health records of patients who visited one of the Berkshire Health System hospitals [8]. Although valuable, these sources do not capture the patients' natural behavior first-hand, nor do they include data from non-treated or non-diagnosed individuals. The popularity of social media platforms, such as Facebook or Twitter, has created unprecedented opportunities to mine large data sets of everyday, user-generated content for patterns of communication that could be indicative of various mental health conditions. Research in this field has been bountiful in the case of depressive disorders [11,12,13]. Together with earlier works on non-digital communication formats
