Suicidal Ideation Detection and Influential Keyword Extraction from Twitter using Deep Learning (SID)

Xie-Yi G

doi:10.4108/eetpht.10.6042

Abstract

INTRODUCTION: This paper builds a text analytics-based solution to help suicide prevention communities detect suicidal signals in text data collected from online platforms and take action to prevent tragedy. OBJECTIVES: The objective of the paper is to build a suicidal ideation detection (SID) model that classifies text as suicidal or non-suicidal, together with a keyword extractor that extracts influential keywords, which are possible suicide risk factors, from the suicidal text. METHODS: This paper proposes an attention-based Bi-LSTM model. The attention layer helps the deep learning model capture the keywords that most influence its classification decisions, and thereby surfaces the keywords in the text that are closely related to suicide risk factors or to reasons for suicidal ideation. RESULTS: Bi-LSTM with Word2Vec embedding has the highest F1-score, 0.95. However, the attention-based Bi-LSTM with Word2Vec embedding, which has an F1-score of 0.94, can produce better accuracy on new and unseen data because its learning curve shows a good fit. CONCLUSION: The absence of a systematic approach to validate and examine the keywords extracted by the attention mechanism and the RAKE algorithm is a gap that needs to be resolved. Future work can focus on a systematic and standard approach for validating the accuracy of the extracted keywords.
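As an illustration of how attention weights can double as keyword scores, the following is a minimal sketch and not the paper's implementation. It assumes a simple scoring form (the dot product of tanh-transformed hidden states with a context vector, followed by a softmax). The hidden states, the context vector, and the function names `attention_weights` and `top_keywords` are hypothetical toy values chosen for illustration; in a real model the hidden states would come from a trained Bi-LSTM.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(hidden_states, context):
    """One weight per token: softmax of tanh(h_t) . context (assumed scoring form)."""
    scores = [
        sum(math.tanh(h) * c for h, c in zip(state, context))
        for state in hidden_states
    ]
    return softmax(scores)


def top_keywords(tokens, weights, k=3):
    """Return the k tokens with the largest attention weights."""
    ranked = sorted(zip(tokens, weights), key=lambda pair: pair[1], reverse=True)
    return [token for token, _ in ranked[:k]]


# Toy inputs: hand-made stand-ins for Bi-LSTM outputs, not real model states.
tokens = ["i", "feel", "hopeless", "and", "alone"]
hidden_states = [
    [0.1, 0.0],
    [0.3, 0.2],
    [1.5, 1.2],
    [0.0, 0.1],
    [1.1, 0.9],
]
context = [1.0, 1.0]  # would be a learned parameter in a trained model

weights = attention_weights(hidden_states, context)
keywords = top_keywords(tokens, weights, k=2)
print(keywords)
```

In this sketch the tokens with the largest hidden-state activations receive the largest weights, so ranking tokens by weight gives a candidate keyword list. Whether such keywords are valid risk factors still requires the kind of validation the conclusion calls for.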
