Combining data-driven natural language processing techniques with traditional methods using predefined word lists may offer greater insights into the connections between language patterns and depression and anxiety symptoms, particularly within specific stressful contexts. Between 2020 and 2021, 1106 participants wrote narrative responses describing their experiences during the COVID-19 pandemic and completed the Depression Anxiety Stress Scale-21 (DASS). We investigated language patterns associated with DASS symptoms using established categories from Linguistic Inquiry and Word Count (LIWC) and sentiment analysis, as well as exploratory natural language processing techniques. Finally, we constructed machine learning regression models to assess how much of the variance in DASS symptoms is related to language use. We found significant positive bivariate correlations between total DASS symptoms and the hypothesized LIWC categories: first-person singular pronouns, absolute language, and negative emotion words. These results remained largely similar when using negative sentiment scores and when statistically controlling for gender, age, and education. Exploratory n-gram analyses also revealed new individual words and phrases correlated with total DASS symptoms. Lastly, our regression models demonstrated a significant association between language use and total DASS symptoms (R² = 0.36–0.62). The current study is one of the first to examine associations between language use and DASS symptoms during the pandemic using both traditional and data-driven techniques. These results replicate and extend prior findings regarding negative emotion and absolute language and identify unique correlates of DASS symptoms during pandemic-related stress, contributing to the literature on language and mental health more broadly.