Improving LIWC Using Soft Word Matching

Yuan Gong, Kevin Shin, Christian Poellabauer

doi:10.1145/3233547.3233632

Abstract

The widely deployed and easy-to-use Linguistic Inquiry and Word Count (LIWC) tool is the gold standard for computerized text analysis in many medical applications, such as patient sentiment analysis, depression detection, and ADHD detection. Unlike most other natural language processing (NLP) settings, the medical field rarely offers large-scale data sets, which makes it challenging to learn effective representations automatically from complex text patterns (e.g., using a deep auto-encoder). LIWC sidesteps this problem by using a human-designed dictionary, in place of a machine learning model, to convert text into a concise and effective vector representation. However, although LIWC's dictionary is large, some potentially informative words may still be missed because the knowledge of the dictionary editors is limited. This problem is particularly conspicuous when the analyzed text is not in formal language (e.g., dialect, slang, or cyber words). To address this problem, we propose a new matching scheme that does not require an exact word match, but instead counts all words that are similar to a key in the LIWC dictionary. The scheme is implemented using WordNet, a large lexical database, and Word2Vec, a machine-learning-based word embedding technique. The output of the proposed method has exactly the same format as LIWC's output, which preserves its usability. Similar to previous work, the proposed method can be viewed as a combination of human domain knowledge and machine learning for text representation encoding.
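
The soft matching idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the two-category dictionary, the 2-D vectors standing in for Word2Vec embeddings, the cosine similarity measure, the 0.95 threshold, and the function names are all hypothetical choices made for illustration, and the WordNet side of the scheme is left out.

```python
import math
from collections import Counter

# Hypothetical miniature dictionary: category -> key words.
# This is NOT the real LIWC lexicon, only a stand-in for illustration.
TOY_DICTIONARY = {
    "posemo": ["happy", "good"],
    "negemo": ["sad", "bad"],
}

# Hypothetical 2-D vectors standing in for Word2Vec embeddings.
TOY_VECTORS = {
    "happy": (1.0, 0.1),
    "good": (0.9, 0.2),
    "glad": (0.95, 0.15),
    "sad": (-1.0, 0.1),
    "bad": (-0.9, 0.2),
    "bummed": (-0.95, 0.2),
    "table": (0.0, 1.0),
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def soft_counts(tokens, dictionary, vectors, threshold=0.95):
    """Count tokens per category, accepting exact OR similar matches.

    A token is counted at most once per category. The threshold is an
    assumed value, not one taken from the paper.
    """
    counts = Counter({category: 0 for category in dictionary})
    for token in tokens:
        for category, keys in dictionary.items():
            if token in keys:
                # Exact match: the behaviour of plain dictionary lookup.
                counts[category] += 1
                continue
            if token not in vectors:
                # No embedding available, so no soft match is possible.
                continue
            # Soft match: similar enough to at least one key of the category.
            if any(
                key in vectors and cosine(vectors[token], vectors[key]) >= threshold
                for key in keys
            ):
                counts[category] += 1
    return dict(counts)
```

With these toy vectors, `soft_counts("i am glad but bummed about the table".split(), TOY_DICTIONARY, TOY_VECTORS)` counts "glad" under `posemo` and "bummed" under `negemo`, even though neither word is a dictionary key, while "table" matches nothing. Because the result keeps one entry per dictionary category, it has the same shape as an exact-match count, which mirrors the abstract's point about preserving the output format.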

