Abstract
Background: The COVID-19 pandemic is impacting mental health, but it is not clear how people with different types of mental health problems were differentially impacted as the initial wave of cases hit.

Objective: The aim of this study is to leverage natural language processing (NLP) with the goal of characterizing changes in 15 of the world’s largest mental health support groups (eg, r/schizophrenia, r/SuicideWatch, r/Depression) found on the website Reddit, along with 11 non–mental health groups (eg, r/PersonalFinance, r/conspiracy) during the initial stage of the pandemic.

Methods: We created and released the Reddit Mental Health Dataset including posts from 826,961 unique users from 2018 to 2020. Using regression, we analyzed trends from 90 text-derived features such as sentiment analysis, personal pronouns, and semantic categories. Using supervised machine learning, we classified posts into their respective support groups and interpreted important features to understand how different problems manifest in language. We applied unsupervised methods such as topic modeling and unsupervised clustering to uncover concerns throughout Reddit before and during the pandemic.

Results: We found that the r/HealthAnxiety forum showed spikes in posts about COVID-19 early on in January, approximately 2 months before other support groups started posting about the pandemic. There were many features that significantly increased during COVID-19 for specific groups including the categories “economic stress,” “isolation,” and “home,” while others such as “motion” significantly decreased. We found that support groups related to attention-deficit/hyperactivity disorder, eating disorders, and anxiety showed the most negative semantic change during the pandemic out of all mental health groups. Health anxiety emerged as a general theme across Reddit through independent supervised and unsupervised machine learning analyses. For instance, we provide evidence that the concerns of a diverse set of individuals are converging in this unique moment of history; we discovered that the more users posted about COVID-19, the more linguistically similar (less distant) the mental health support groups became to r/HealthAnxiety (ρ=–0.96, P<.001). Using unsupervised clustering, we found the suicidality and loneliness clusters more than doubled in the number of posts during the pandemic. Specifically, the support groups for borderline personality disorder and posttraumatic stress disorder became significantly associated with the suicidality cluster. Furthermore, clusters surrounding self-harm and entertainment emerged.

Conclusions: By using a broad set of NLP techniques and analyzing a baseline of prepandemic posts, we uncovered patterns of how specific mental health problems manifest in language, identified at-risk users, and revealed the distribution of concerns across Reddit, which could help provide better resources to its millions of users. We then demonstrated that textual analysis is sensitive enough to uncover mental health complaints as they appear in real time, identifying vulnerable groups and alarming themes during COVID-19, and thus may have utility during the ongoing pandemic and other world-changing events such as elections and protests.
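The reported association between COVID-19 posting and distance to r/HealthAnxiety (ρ=–0.96) can be illustrated with a rank correlation. The sketch below assumes ρ denotes Spearman's rank correlation, which is the conventional reading of the symbol; the function names and the two input series are hypothetical placeholders, not the study's data or code.

```python
from statistics import mean


def rank(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the ranks."""
    return pearson(rank(x), rank(y))


# Hypothetical values: share of posts mentioning COVID-19 per time window,
# and a distance between a support group and r/HealthAnxiety in that window.
covid_share = [0.00, 0.01, 0.05, 0.12, 0.20, 0.31]
distance = [0.90, 0.88, 0.80, 0.71, 0.69, 0.55]

rho = spearman(covid_share, distance)
print(f"Spearman rho = {rho:.2f}")
```

A strongly negative ρ, as in the paper, means that windows with more COVID-19 posts tend to be the windows in which the group's language sits closer to r/HealthAnxiety.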
Highlights
The ongoing outbreak of a novel coronavirus causing the disease COVID-19 is likely to have impacts on mental health as many individuals experience uncertainty and losses of income, social engagement, mobility, and physical health.
Individual word stems are obtained from term frequency–inverse document frequency (TF-IDF). LIWC: Linguistic Inquiry and Word Count.
Our findings suggest the pandemic may have induced health anxiety among several mental health and non–mental health communities given that posts on r/COVID19_support were classified most frequently as belonging to r/HealthAnxiety, midpandemic posts from the r/anxiety subreddit became significantly enriched for posts from the health anxiety cluster, latent Dirichlet allocation (LDA) topic analysis found the health anxiety topic significantly http://www.jmir.org/2020/10/e22635/
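The term frequency–inverse document frequency weighting mentioned in the highlights above can be sketched in a few lines. This is a minimal illustration under stated assumptions: it uses one common textbook definition (term frequency times the natural log of the inverse document frequency), plain whitespace tokens rather than word stems, and three invented example posts. The study's actual preprocessing and weighting variant are not specified in this text, and libraries such as scikit-learn apply smoothing that gives different numbers.

```python
import math
from collections import Counter


def tfidf(docs):
    """Compute TF-IDF weights for a list of tokenized documents.

    tf  = count of term in document / document length
    idf = ln(number of documents / number of documents containing the term)
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(doc))  # count each term once per document

    weights = []
    for doc in docs:
        if not doc:  # guard against empty documents
            weights.append({})
            continue
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Invented example posts (not from the Reddit Mental Health Dataset).
docs = [
    "i am worried about symptoms".split(),
    "i lost my job and am worried".split(),
    "i cannot focus at home".split(),
]

weights = tfidf(docs)
for i, w in enumerate(weights):
    top = sorted(w, key=w.get, reverse=True)[:3]
    print(f"post {i}: {top}")
```

Terms that occur in every post (here, "i") receive a weight of zero, while terms concentrated in a single post score highest, which is why the weighting is useful for surfacing words that distinguish one support group from another.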
Summary
The ongoing outbreak of a novel coronavirus causing the disease COVID-19 is likely to have impacts on mental health as many individuals experience uncertainty and losses of income, social engagement, mobility, and physical health. Characterizing these impacts is critical to motivate and inform the provision of appropriate therapeutic responses. We apply text processing and machine learning techniques to a data set of Reddit posts to analyze COVID-19’s impacts on mental health discourse as a potential proxy for changes in mental health needs. The scope of likely mental health deterioration during this pandemic creates an unprecedented need to understand how different mental health cohorts are responding to the outbreak, so that patient assessments can be designed and resources allocated accordingly.