Ruminative Thought Content Changes by Year: A Computational Linguistic Analysis
- Conference Article
4
- 10.18653/v1/w18-1109
- Jan 1, 2018
The paper provides an outline of the scope for synergy between computational linguistic analysis and population studies. It first reviews where population studies stand in terms of using social media data. Demographers are entering the realm of big data in force. But, this paper argues, population studies have much to gain from computational linguistic analysis, especially in terms of explaining the drivers behind population processes. The paper gives two examples of how the method can be applied, and concludes with a fundamental caveat. Yes, computational linguistic analysis provides a possible key for integrating micro theory into any demographic analysis of social media data. But results may be of little value inasmuch as fundamental sample characteristics are unknown.
- Supplementary Content
- 10.1080/14484528.2025.2462533
- Feb 13, 2025
- Life Writing
This essay is a collaborative work between an author/scholar and a linguistics scholar to navigate the terrain of writing about disability. It is a computational linguistic analysis of the memoir Head Above Water: Reflections on Illness. It is also a creative-critical reflection on the lack of illness narratives by Arab women authors. Conceptual metaphors, motion verbs, and linguistic patterns were examined in the memoir. The analysis revealed predominant conceptual structures, including ACTION IS MOTION and LIFE IS VERTICALITY, with a higher frequency of perceptual and cognitive verbs compared to motion verbs. Quantitative data showed sporadic distribution of ability-denoting modals in affirmative and negating sentences, reflecting a nuanced portrayal of resilience. The author's coinage of ‘random disability’ was identified as a linguistic coping mechanism, defying traditional abled/disabled categorisations. The computational analysis was corroborated by the author's reflections, revealing both conscious and unconscious linguistic patterns. This interdisciplinary approach combines objective computational and cognitive linguistic analysis with subjective authorial insight, offering a novel methodology for examining illness narratives. It integrates a linguistic outsider's perspective, i.e. attempting to systematically address a text using theories and tools from linguistics, on a memoir with a self-reflection from the author to enhance its cognitive linguistic reading.
- Research Article
34
- 10.2196/16969
- Aug 12, 2020
- JMIR Mental Health
Background: Recent research has emphasized the need for accessing information about patients to augment mental health patients’ verbal reports in clinical settings. Although it has not been introduced in clinical settings, computational linguistic analysis on social media has proved it can infer mental health attributes, implying a potential use as collateral information at the point of care. To realize this potential and make social media insights actionable to clinical decision making, the gaps between computational linguistic analysis on social media and the current work practices of mental health clinicians must be bridged. Objective: This study aimed to identify information derived from patients’ social media data that can benefit clinicians and to develop a set of design implications, via a series of low-fidelity (lo-fi) prototypes, on how to deliver the information at the point of care. Methods: A team of clinical researchers and human-computer interaction (HCI) researchers conducted a long-term co-design activity for over 6 months. The needs-affordances analysis framework was used to refine the clinicians’ potential needs, which can be supported by patients’ social media data. On the basis of those identified needs, the HCI researchers iteratively created 3 different lo-fi prototypes. The prototypes were shared with both groups of researchers via videoconferencing software for discussion and feedback. During the remote meetings, potential clinical utility, potential use of the different prototypes in a treatment setting, and areas of improvement were discussed. Results: Our first prototype was a card-type interface that supported treatment goal tracking. Each card included attribute levels: depression, anxiety, social activities, alcohol, and drug use. This version confirmed what types of information are helpful but revealed the need for a glanceable dashboard that highlights the trends of this information. As a result, we then developed the second prototype, an interface that shows the clinical state and trend. We found that focusing more on the changes since the last visit without visual representation can be more compatible with clinicians’ work practices. In addition, the second phase of needs-affordances analysis identified 3 categories of information relevant to patients with schizophrenia: symptoms related to psychosis, symptoms related to mood and anxiety, and social functioning. Finally, we developed the third prototype, a clinical summary dashboard that showed changes from the last visit in plain text and contrasting colors. Conclusions: This exploratory co-design research confirmed that mental health attributes inferred from patients’ social media data can be useful for clinicians, although it also revealed a gap between computational social media analyses and clinicians’ expectations and conceptualizations of patients’ mental health states. In summary, the iterative co-design process crystallized design directions for the future interface, including how we can organize and provide symptom-related information in a way that minimizes the clinicians’ workloads.
- Research Article
5
- 10.1145/1570000/1564522/p66-pretorius.pdf
- Mar 31, 2009
Setswana, a Bantu language in the Sotho group, is one of the eleven official languages of South Africa. The language is characterised by a disjunctive orthography, mainly affecting the important word category of verbs. In particular, verbal prefixal morphemes are usually written disjunctively, while suffixal morphemes follow a conjunctive writing style. Therefore, Setswana tokenisation cannot be based solely on whitespace, as is the case in many alphabetic, segmented languages, including the conjunctively written Nguni group of South African Bantu languages. This paper shows how a combination of two tokeniser transducers and a finite-state (rule-based) morphological analyser may be combined to effectively solve the Setswana tokenisation problem. The approach has the important advantage of bringing the processing of Setswana beyond the morphological analysis level in line with what is appropriate for the Nguni languages. This means that the challenge of the disjunctive orthography is met at the tokenisation/morphological analysis level and does not in principle propagate to subsequent levels of analysis such as POS tagging and shallow parsing, etc. Indeed, the approach ensures that an aspect such as orthography does not obfuscate sound linguistics and, ultimately, proper semantic analysis, which remains the ultimate aim of linguistic analysis and therefore also computational linguistic analysis.
- Research Article
- 10.30564/fls.v7i11.11430
- Oct 22, 2025
- Forum for Linguistic Studies
Higher education faces increasing demands to develop students' ethical, creative, and emotional capabilities alongside academic mastery. While the World Economic Forum identifies critical thinking, creativity, emotional intelligence, and moral judgment as essential future workforce skills, most university curricula remain compartmentalized, prioritizing cognitive over affective and aesthetic domains. Current educational approaches lack empirically validated frameworks that systematically integrate moral development with artistic learning in digitally enhanced environments. This study designed, implemented, and evaluated an integrated moral-art curriculum module to foster undergraduate students' moral reasoning, creative thinking, and aesthetic sensitivity through blended learning approaches. Using Design and Development Research guided by the ADDIE instructional model, a quasi-experimental pre-post study was conducted with 50 undergraduate students from diverse academic backgrounds across four institution types. The 12-week module combined synchronous classroom instruction with asynchronous digital learning via the Learning Pass platform. Outcomes were assessed through validated pre-post questionnaires and computational linguistic analysis using LIWC and Coh-Metrix tools. Significant improvements occurred across all domains: moral reasoning increased 35.5% (Cohen's d = 0.91), creative thinking rose 32.3%, and aesthetic sensitivity improved 41.4% (d = 0.85). Linguistic analysis revealed enhanced lexical diversity (+16.4%), academic vocabulary (+41.5%), and empathy markers (+42.1%), with reduced anxiety language (−18.2%). Strong inter-domain correlations confirmed the integrated pedagogical framework's theoretical viability. Results provide empirical support for scalable, digitally enhanced interdisciplinary curricula uniting moral and artistic education in higher education contexts.
- Conference Article
6
- 10.3115/1564508.1564522
- Jan 1, 2009
Setswana, a Bantu language in the Sotho group, is one of the eleven official languages of South Africa. The language is characterised by a disjunctive orthography, mainly affecting the important word category of verbs. In particular, verbal prefixal morphemes are usually written disjunctively, while suffixal morphemes follow a conjunctive writing style. Therefore, Setswana tokenisation cannot be based solely on whitespace, as is the case in many alphabetic, segmented languages, including the conjunctively written Nguni group of South African Bantu languages. This paper shows how a combination of two tokeniser transducers and a finite-state (rule-based) morphological analyser may be combined to effectively solve the Setswana tokenisation problem. The approach has the important advantage of bringing the processing of Setswana beyond the morphological analysis level in line with what is appropriate for the Nguni languages. This means that the challenge of the disjunctive orthography is met at the tokenisation/morphological analysis level and does not in principle propagate to subsequent levels of analysis such as POS tagging and shallow parsing, etc. Indeed, the approach ensures that an aspect such as orthography does not obfuscate sound linguistics and, ultimately, proper semantic analysis, which remains the ultimate aim of linguistic analysis and therefore also computational linguistic analysis.
- Research Article
- 10.1177/17504813231207948
- Dec 12, 2023
- Discourse & Communication
Social media has become a powerful conduit for misinformation during major public events. As a result, an extant body of research has emerged on misinformation and its diffusion. However, the research is fragmented and has mainly focused on understanding the content of misinformation messages. Little attention is paid to the production and consumption of misinformation. This study presents the results of a detailed comparative analysis of the production, consumption, and diffusion of misinformation with authentic information. Our findings, based on extensive use of computational linguistic analyses of COVID-19 pandemic-related messages on the Twitter platform, revealed that misinformation and authentic information exhibit very different characteristics in terms of their contents, production, diffusion, and their ultimate consumption. To support our study, we carefully selected a sample of 500 widely propagated messages confirmed by fact-checking websites as misinformation or authentic information about pandemic-related topics from the Twitter platform. Detailed computational linguistic analyses were performed on these messages and their replies (N = 198,750). Additionally, we analyzed approximately 1.2 million Twitter user accounts responsible for producing, forwarding, or replying to these messages. Our extensive and detailed findings were used to develop and propose a theoretical framework for understanding the diffusion of misinformation on social media. Our study offers insights for social media platforms, researchers, policymakers, and online information consumers about how misinformation spreads over social media platforms.
- Research Article
- 10.1002/alz.057876
- Dec 1, 2021
- Alzheimer's & Dementia
Multivariate computational linguistic analysis for early detection of cognitive impairment
- Book Chapter
4
- 10.5040/9798216005490.ch-007
- Jan 1, 1999
The Representation of the Homeless in U.S. Electronic Media: A Computational Linguistic Analysis
- Research Article
- 10.1108/gkmc-12-2023-0520
- Jul 30, 2024
- Global Knowledge, Memory and Communication
Purpose: This study aims to explore the linguistic and syntactic textual features of Central Bank of Nepal (Nepal Rastra Bank [NRB])’s monetary policy. Considering the recent literature and methodological advancement in computational linguistic analysis, this study intends to explore the features of published monetary policy reports. Design/methodology/approach: A text mining technique has been used on the monetary policy published by the Central Bank of Nepal for the period between 2002/03 and 2021/22 to describe textual features such as length, tone, degree of forward looking, use of numerical contents and readability. The raw text was tokenized using Python’s Natural Language Toolkit. Considering the LM dictionary, the frequency of tokens matching the dictionary is computed and divided by the total number of words to normalize the obtained value. Findings: This study found that NRB publishes lengthy monetary policies during economic contractions and vice versa. The tone of the policies is pessimistic most of the time. NRB’s policies are not sufficiently forward looking and are too complex to be comprehended by laypeople. Ergo, NRB shall form a team of communication experts to ensure publication of optimistic policies with appropriate linguistic features. Furthermore, publishing the minutes of monetary policy meetings will help enhance effective communication through transparency and proper functioning of the expectations channel. Originality/value: To the best of the authors’ knowledge, no similar study has been conducted to assess the textual features of monetary policy in Nepal. This study will be helpful to gauge the status of central bank communication in the context of emerging and least developed countries.
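The normalization step described in this abstract (dictionary matches divided by total words) can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the word list is a tiny hypothetical stand-in for the LM dictionary, the sample sentence is invented, and a simple regular expression replaces the NLTK tokenizer so the example runs without extra packages.

```python
import re

# Hypothetical stand-in word list; the LM dictionary used in the study is far larger.
NEGATIVE_WORDS = {"decline", "risk", "weak", "loss"}


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens (regex stand-in for NLTK)."""
    return re.findall(r"[a-z]+", text.lower())


def dictionary_share(text, dictionary):
    """Return the number of tokens found in the dictionary divided by the total token count."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0  # avoid dividing by zero on empty input
    matches = sum(1 for token in tokens if token in dictionary)
    return matches / len(tokens)


# Invented example sentence: 7 tokens, 2 of which are in the word list.
sample = "Growth may decline as credit risk rises."
share = dictionary_share(sample, NEGATIVE_WORDS)
```

Dividing by the token count keeps scores comparable across documents of different lengths, which matters when report length itself varies from year to year.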
- Research Article
- 10.59075/cp869f71
- Mar 27, 2025
- The Critical Review of Social Sciences Studies
This study examines the influence of Doraemon cartoons on children's language acquisition, cognitive development, behavior, and emotional growth. Utilizing advanced tools such as Anaconda, Jupyter Notebook, the Natural Language Toolkit (NLTK), VADER for sentiment analysis, and pyLDAvis for visualization, the research employs Latent Dirichlet Allocation (LDA) topic modeling to identify five key themes—Adventure and Exploration with Technology, Daily Life and Decision-Making, Social Dynamics and Doraemon’s Gadgets, Domestic Challenges and Desires, and Economic and Mechanical Solutions—reflecting the integration of sci-fi imagination with childhood challenges. These themes, rich with verbs and adjectives like "useful," "mysterious," and action-oriented terms (e.g., "explore," "solve"), foster lexical acquisition by embedding new vocabulary in engaging, fantastical contexts, enhancing children's semantic networks and narrative comprehension. The sci-fi narratives stimulate cognitive development by encouraging problem-solving, creativity, and emotional regulation through stories of gadget-enabled triumphs and setbacks. The analysis further explores LDA visualization, Distribution of Thematic Categories, Average Sentiment Polarity of Doraemon Gadgets, Distribution of Emotions, Rolling Sentiment Polarity Trends, Emotional Growth Timeline, Nature-Related Words, and Human Relationships and Social Bonds. Findings reveal that gadgets dominate thematic content (51% frequency), with both positive and negative sentiment, while emotions like anticipation (0.14) and joy (0.10) prevail, alongside fluctuating sentiment trends tied to adventure, growth, and failure. The research underscores Doraemon’s role in promoting technological curiosity, moral reasoning, emotional intelligence, and linguistic growth, offering valuable insights for educators and parents to harness its educational potential for childhood development.
- Research Article
- 10.2139/ssrn.2830748
- Aug 29, 2016
- SSRN Electronic Journal
The purpose of this article is to examine the psychological elements of the ideology of members of the major parties in the Australian federal parliament using computational linguistics. The cohort consists of the 485 Labor, Liberal and National parliamentarians who were in parliament over the period April 1996 to July 2014. I use computational linguistics to extract linguistic variables from first speeches in parliament of those in the cohort. I draw from methods used in machine learning to develop a classifier which has a 74% out of sample (leave-one-out cross validation) accuracy in classifying parliamentarians as liberal (ALP) or conservative (Liberal/National Party Coalition). I then examine the salient variables and find that there are only six linguistic markers of conservative/liberal ideology. Of these, two are consistent with the previous findings that liberals tend to display more psychological 'openness' than conservatives and less psychological 'conscientiousness'. However, one of these variables strongly challenges the idea that conservatives look to the past and liberals to the future. Two of the linguistic variables are 'suppressor' variables and I discuss these variables in the context of their role in suppressing 'irrelevant' variance in the other independent variables.
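Leave-one-out cross validation, the evaluation scheme named in this abstract, holds out each observation in turn, fits on the rest, and scores the prediction for the held-out item. The sketch below is only an illustration under assumptions: the nearest-centroid classifier, the single numeric feature, and the labels "A" and "B" are hypothetical and are not the article's model or data.

```python
def nearest_centroid_predict(train, x):
    """Predict the label whose training mean is closest to x (hypothetical toy classifier)."""
    sums, counts = {}, {}
    for value, label in train:
        sums[label] = sums.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return min(sums, key=lambda label: abs(x - sums[label] / counts[label]))


def leave_one_out_accuracy(data):
    """Hold out each item once, train on the remainder, and return the share predicted correctly."""
    correct = 0
    for i, (value, label) in enumerate(data):
        train = data[:i] + data[i + 1:]  # every observation except the held-out one
        if nearest_centroid_predict(train, value) == label:
            correct += 1
    return correct / len(data)


# Invented (feature, label) pairs; the last "B" item sits among the "A" values.
data = [(0.1, "A"), (0.2, "A"), (0.3, "A"), (0.9, "B"), (1.0, "B"), (0.25, "B")]
accuracy = leave_one_out_accuracy(data)
```

Because every prediction is made by a model that never saw the held-out item, the resulting accuracy is an out-of-sample estimate, which suits a cohort of only a few hundred speeches.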
- Research Article
- 10.5281/zenodo.1299691
- Jan 6, 2015
- Zenodo (CERN European Organization for Nuclear Research)
Part-of-speech (POS) tagging is a vital task in Natural Language Processing (NLP) for any language. It involves analysing the construction, behaviour, and dynamics of the language, knowledge that can be used in computational linguistic analysis and automation applications. In this context, dealing with unknown words (words that do not appear in the lexicon) is also an important task, since NLP systems are used in more and more new applications. One aid to predicting the lexical categories of unknown words is the use of syntactic knowledge of the language. This research uses the distinction between open-class and closed-class words, together with syntactic features of the language, to predict the lexical categories of unknown words in the tagging process. An experiment is performed to investigate the ability of the approach to parse unknown words using syntactic knowledge without human intervention. The experiment shows that the performance of the tagging process is enhanced when the word class distinction is used together with syntactic rules to parse sentences containing unknown words in the Sinhala language.
- Research Article
7
- 10.1080/17467586.2015.1038286
- Jun 29, 2015
- Dynamics of Asymmetric Conflict
We investigated linguistic patterns in the discourse of three prominent autocratic leaders whose tenure lasted for multiple decades. The texts of Fidel Castro, Zedong Mao, and Hosni Mubarak were analyzed using a computational linguistic tool (Coh-Metrix) to explore persuasive linguistic features during social disequilibrium and stability. The analyses were guided by the elaboration likelihood model of persuasion, which contrasts central versus peripheral routes to persuasion. Results show these leaders utilize the central persuasion route, with more formal discourse patterns during times of crises versus non-crises. A significant interaction between leader age and armed conflict revealed interesting adaptive characteristics. Specifically, leaders' formality decreases over time in both crises and non-crises times, but this attenuation is less prominent during crisis periods. The implications of these results are discussed in the context of using computational linguistics analyses to generate potential predictive models of social disequilibrium and to advance our understanding of authoritarian regimes.
- Conference Article
- 10.1109/bigdata.2018.8622319
- Dec 1, 2018
Increasing globalization of the world leads to an emerging need for ways to analyze and understand groups from different cultures and ideologies. Researchers have used written text as a medium to examine political discourse and analyze value-motivated groups. Previous works showed that computational linguistic analysis can be performed to infer the flexibility of value-motivated groups from their writings. The main premise of these works is that text can bring insights into individuals’ and groups’ way of thinking, and potentially, behaviour. While existing works provide viable solutions for characterizing groups’ ideological behaviour, they perform their analyses over all text published by the groups. However, researchers have found that religious and value-motivated groups can’t be analyzed collectively as they regularly evolve. To address this gap, we analyze the performance of existing methods on single documents. Experimental results show that previous features (e.g., use of pronouns and judgment statements) used to predict groups’ flexibility are less predictive for single documents’ flexibility. We show that a newly added feature regarding the identity of a group provides a significant contribution to the prediction process. Furthermore, due to the unbalanced nature of our data, we propose a weighting scheme for linear regression based on the inter-group variance. Results indicate that a weighted least squares approach significantly outperforms a traditional least squares approach. This work brings new insights into the characteristics of different linguistic and performative signals, and their relationship to the linguistic flexibility of groups. It also provides a decision making support tool for practical use by practitioners.
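To illustrate the difference between ordinary and weighted least squares mentioned in this abstract, here is a minimal sketch for a single predictor. It is not the paper's method: the data points and the weights are invented, and the paper's weighting scheme based on inter-group variance is not reproduced.

```python
def weighted_least_squares(xs, ys, weights):
    """Fit y = intercept + slope * x by minimising sum(w * residual**2)."""
    total = sum(weights)
    x_mean = sum(w * x for w, x in zip(weights, xs)) / total
    y_mean = sum(w * y for w, y in zip(weights, ys)) / total
    sxy = sum(w * (x - x_mean) * (y - y_mean) for w, x, y in zip(weights, xs, ys))
    sxx = sum(w * (x - x_mean) ** 2 for w, x in zip(weights, xs))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return intercept, slope


# Invented data: the first three points lie on y = 1 + 2x, the last one does not.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 20.0]

# Equal weights reduce to ordinary least squares.
ols_intercept, ols_slope = weighted_least_squares(xs, ys, [1.0, 1.0, 1.0, 1.0])

# Hypothetical weights that ignore the last point entirely.
wls_intercept, wls_slope = weighted_least_squares(xs, ys, [1.0, 1.0, 1.0, 0.0])
```

With equal weights the outlying point pulls the slope well above 2, while the hypothetical weights recover the line through the first three points. In practice the weights would come from a variance estimate rather than being set by hand.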