Abstract

In this paper, we detect the mental health status of social media users from their posts. Because the length of social media text varies widely, we show that input text length affects the main deep learning architectures used in text classification, such as the Convolutional Neural Network (CNN) and the Recurrent Neural Network (RNN). We evaluate the models on a standard English dataset and a Reddit mental illness dataset using three indicators: accuracy, model parameter complexity, and time complexity. In these experiments we set the maximum input text length of the classification models and change the proportion of the training/test split. As text length grows, classification accuracy tends to increase, as do time complexity and model parameter complexity. A shorter maximum text length results in lower accuracy, while a longer maximum text length leads to higher training costs. These results indicate that text length has a significant influence on common text classification models on both the standard English dataset and the Reddit mental illness dataset. Text length, and in particular the maximum input length set separately for each model, places a strict limit on text classification algorithms.
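To make the notion of a maximum input length concrete, the following is a minimal sketch of how a tokenized post might be clipped or padded to a fixed length before it reaches a CNN or RNN. It is an illustration only, not the authors' preprocessing: the function names `pad_or_truncate` and `fraction_truncated`, the padding id `PAD_ID`, and the choice to truncate from the end are all assumptions.

```python
from typing import List, Sequence

PAD_ID = 0  # hypothetical padding token id, assumed for illustration


def pad_or_truncate(token_ids: Sequence[int], max_len: int,
                    pad_id: int = PAD_ID) -> List[int]:
    """Return exactly max_len token ids: clip long inputs, right-pad short ones."""
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    clipped = list(token_ids[:max_len])  # drop tokens beyond the limit
    return clipped + [pad_id] * (max_len - len(clipped))


def fraction_truncated(lengths: Sequence[int], max_len: int) -> float:
    """Share of documents that lose tokens at the given maximum length."""
    if not lengths:
        return 0.0
    return sum(1 for n in lengths if n > max_len) / len(lengths)
```

A small maximum length discards content from long posts, which is consistent with the lower accuracy reported above, while a large one makes every input as long as the limit and so raises the per-example compute, consistent with the higher training cost. Calling `fraction_truncated` on the document lengths of a corpus gives a rough sense of how much text a candidate limit would cut.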
