Less is More: Stress Detection through Condensed Social Media Contents

Zeyad Alghamdi, Tharindu Kumarage, Huan Liu, H Russell Bernard, Garima Agrawal

doi:10.34190/ecsm.11.1.2273

Abstract

In the digital age, social media has become a go-to platform for stress-related discussions, yielding valuable data to advance the understanding and detection of stress. Swift identification of stress indicators in these online conversations is essential for enabling immediate support and helping to avert subsequent severe mental and physical health issues, especially during global crises such as pandemics and conflicts. Detecting stress in social media posts automatically poses a formidable challenge. While classifiers based on supervised Pretrained Language Models (PLMs) and zero-shot Large Language Models (LLMs) have demonstrated strong performance, they exhibit limitations, especially on platforms like Reddit. On Reddit, users tend to write lengthy, expressive posts, and these methods often fail to consider the entire context, leading to incomplete or inaccurate assessments of a user's mental health or stress status. To overcome these limitations, we present a new approach to identifying and classifying stress-related discourse on social media. Our approach involves analyzing condensed versions of user posts, such as user-provided summaries or the "Too Long Didn’t Read" (TLDR) portion of the original post. We ask whether these abridged texts can yield a more accurate classification of stress. In this paper, we make the following contributions. First, we investigate how a model's performance, given the textual context it perceives, relates to the length of social media posts. Second, we present a novel approach that uses summarized texts for stress detection. We experiment with different classifiers and compare their stress detection accuracy on summarized versus full-length posts.
Furthermore, by examining the emotional and linguistic features of the original posts and their summaries, we suggest improvements to current state-of-the-art LLM-based stress classifier prompts, thereby enhancing stress detection capabilities. Finally, when user summaries are absent, we use LLMs to synthetically generate meaningful summaries of user posts. Our results show that stress detection performance deteriorates for longer posts and that using the TLDR and summaries improves classification outcomes. We also provide augmented datasets containing human and AI-generated summaries for future research in stress detection on social media.
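
As an illustration of the condensed-input idea described in the abstract, the following is a minimal sketch of how the TLDR portion of a post might be separated from its body before classification. It is not the authors' pipeline: the function name `split_tldr`, the marker pattern, and the choice to use the last marker in the post are all assumptions made for this example.

```python
import re

# Assumed marker pattern (hypothetical): matches common spellings such as
# "TL;DR", "TLDR:", "tl dr -". Real posts may use other variants.
TLDR_MARKER = re.compile(r"\btl\s*;?\s*dr\b\s*[:\-]?\s*", re.IGNORECASE)


def split_tldr(post: str):
    """Split a post into (body, summary).

    The summary is the text after the last TLDR marker, or None when the
    post has no marker or nothing follows it.
    """
    matches = list(TLDR_MARKER.finditer(post))
    if not matches:
        return post.strip(), None
    last = matches[-1]  # assumption: the summary sits at the end of the post
    body = post[:last.start()].strip()
    summary = post[last.end():].strip()
    return body, summary or None


post = (
    "Work has been piling up for weeks and I can't sleep.\n\n"
    "TL;DR: deadlines are crushing me."
)
body, summary = split_tldr(post)
# A downstream stress classifier (not shown) could take `summary` as input
# when it is available and fall back to `body` otherwise.
```

Posts that place the TLDR at the top would need different handling, since this sketch treats everything after the last marker as the summary.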

Full Text