Linguistic Insights Research Articles

BackgroundAccording to a World Health Organization report in 2017, there was almost one patient with depression among every 20 people in China. However, the diagnosis of depression is usually difficult in terms of clinical detection owing to slow observation, high cost, and patient resistance. Meanwhile, with the rapid emergence of social networking sites, people tend to share their daily life and disclose inner feelings online frequently, making it possible to effectively identify mental conditions using the rich text information. There are many achievements regarding an English web-based corpus, but for research in China so far, the extraction of language features from web-related depression signals is still in a relatively primary stage.ObjectiveThe purpose of this study was to propose an effective approach for constructing a depression-domain lexicon. This lexicon will contain language features that could help identify social media users who potentially have depression. Our study also compared the performance of detection with and without our lexicon.MethodsWe autoconstructed a depression-domain lexicon using Word2Vec, a semantic relationship graph, and the label propagation algorithm. These two methods combined performed well in a specific corpus during construction. The lexicon was obtained based on 111,052 Weibo microblogs from 1868 users who were depressed or nondepressed. During depression detection, we considered six features, and we used five classification methods to test the detection performance.ResultsThe experiment results showed that in terms of the F1 value, our autoconstruction method performed 1% to 6% better than baseline approaches and was more effective and steadier. When applied to detection models like logistic regression and support vector machine, our lexicon helped the models outperform by 2% to 9% and was able to improve the final accuracy of potential depression detection.ConclusionsOur depression-domain lexicon was proven to be a meaningful input for classification algorithms, providing linguistic insights on the depressive status of test subjects. We believe that this lexicon will enhance early depression detection in people on social media. Future work will need to be carried out on a larger corpus and with more complex methods.

Read full abstract

Addressing differential item functioning (DIF) provides validity evidence to support the interpretation of test scores across groups. Conventional DIF methods flag DIF items statistically, but often fail to consolidate a substantive interpretation. The lack of interpretability of DIF results is particularly pronounced in writing assessment where the matching of test takers’ proficiency levels often relies on external variables and the reported DIF effect is frequently small in magnitude. Using responses to a prompt that showed small gender DIF favoring female test takers, we demonstrate a corpus-based approach that helps address DIF interpretation. To provide linguistic insights into the possible sources of the small DIF effect, this study compared a gender-balanced corpus of 826 writing samples matched by test takers’ performance on the reading and listening components of the test. Four groups of linguistic features that correspond to the rating dimensions, and thus partially represent the writing construct were analyzed. They include (1) sentiment and social cognition, (2) cohesion, (3) syntactic features, and (4) lexical features. After initial screening, 123 linguistic features, all of which were correlated with the writing scores, were retained for gender comparison. Among these selected features, female test takers’ writing samples scored higher on six of them with small effect sizes in the categories of cohesion and syntactic features. Three of the six features were positively correlated with higher writing scores, while the other three were negative. These results are largely consistent with previous findings of gender differences in language use. Additionally, the small differences in the language features of the writing samples (in terms of the small number of features that differ between genders and the small effect size of the observed differences) are consistent with the previous DIF results, both suggesting that the effect of gender differences on the writing scores is likely to be very small. In sum, the corpus-based findings provide linguistic insights into the gender-related language differences and their potential consequences in a testing context. These findings are meaningful for furthering our understanding of the small gender DIF effect identified through statistical analysis, which lends support to the validity of writing scores.

Read full abstract

Linguistic Insights Research Articles

Related Topics

Articles published on Linguistic Insights

QuestionComb: A Gamification Approach for the Visual Explanation of Linguistic Phenomena through Interactive Labeling

Does think mean the same thing as believe? Linguistic insights into religious cognition.

Discourse Relations and Connectives in Higher Text Structure

A Discursive Analysis of Crisis Response Strategies in CEO Apologies—Drawing on Linguistic Insights from the Appraisal Framework

Discussion: multiple approaches in CLIL: cognitive, affective and linguistic insights

Diachronic Distribution of Elemental Ordering in English

A linguist's perspective on teaching communication disorders

The composite scanning method (urbslingua)

Covid-19 mõju liivi keele omandamise võimalustele

Cognitive Linguistics and its tools in studying conceptual metaphor: research and practice in ESP

O'ZBEK TILIDAGI TILSHUNOSLIK TERMINLARINING MOBIL ILOVASINI YARATISH

Klaipėdos krašto šiaurinės dalies nausėdijos XVI‒XVIII a. vietovardžiuose ir asmenvardžiuose | Oikonyms and Personal Names as a Source for Determining New Settlements in the Northern Part of the Klaipėda (Memel) Region: The 16th to the 18th Centuries

Brain, mind and linguistic processing insights into the dynamic nature of bilingualism and its outcome effects

Conciliation and Conflict in the Meccan and Medinan Qur'an: A Thematic Study of Suras 6 to 9

Automatic generation of lexica for sentiment polarity shifters

Automatic Construction of a Depression-Domain Lexicon Based on Microblogs: Text Mining Study.

Using Corpus Analyses to Help Address the DIF Interpretation: Gender Differences in Standardized Writing Assessment.

The Contribution of Cognitive Linguistics to the Acquisition of Polysemy: A Dictionary Entry-Based Study with Chinese Learners of English

A Linguistic Insight into the Legislative Drafting of English-Speaking Jurisdictions

‘Hakken en plakken’

Lead the way for us

Editage

Paperpal

R Discovery

Mind the Graph

Linguistic Insights Research Articles

Related Topics

Articles published on Linguistic Insights

QuestionComb: A Gamification Approach for the Visual Explanation of Linguistic Phenomena through Interactive Labeling

Does think mean the same thing as believe? Linguistic insights into religious cognition.

Discourse Relations and Connectives in Higher Text Structure

A Discursive Analysis of Crisis Response Strategies in CEO Apologies—Drawing on Linguistic Insights from the Appraisal Framework

Discussion: multiple approaches in CLIL: cognitive, affective and linguistic insights

Diachronic Distribution of Elemental Ordering in English

A linguist's perspective on teaching communication disorders

The composite scanning method (urbslingua)

Covid-19 mõju liivi keele omandamise võimalustele

Cognitive Linguistics and its tools in studying conceptual metaphor: research and practice in ESP

O'ZBEK TILIDAGI TILSHUNOSLIK TERMINLARINING MOBIL ILOVASINI YARATISH

Klaipėdos krašto šiaurinės dalies nausėdijos XVI‒XVIII a. vietovardžiuose ir asmenvardžiuose | Oikonyms and Personal Names as a Source for Determining New Settlements in the Northern Part of the Klaipėda (Memel) Region: The 16th to the 18th Centuries

Brain, mind and linguistic processing insights into the dynamic nature of bilingualism and its outcome effects

Conciliation and Conflict in the Meccan and Medinan Qur'an: A Thematic Study of Suras 6 to 9

Automatic generation of lexica for sentiment polarity shifters

Automatic Construction of a Depression-Domain Lexicon Based on Microblogs: Text Mining Study.

Using Corpus Analyses to Help Address the DIF Interpretation: Gender Differences in Standardized Writing Assessment.

The Contribution of Cognitive Linguistics to the Acquisition of Polysemy: A Dictionary Entry-Based Study with Chinese Learners of English

A Linguistic Insight into the Legislative Drafting of English-Speaking Jurisdictions

‘Hakken en plakken’