Abstract

Background: Given the high volume of text-based communication such as email, Facebook, Twitter, and additional web-based and mobile apps, there are unique opportunities to use text to better understand underlying psychological constructs such as emotion. Emotion recognition in text is critical to commercial enterprises (eg, understanding the valence of customer reviews) and to current and emerging clinical applications (eg, as markers of clinical progress and risk of suicide), and the Linguistic Inquiry and Word Count (LIWC) is a commonly used program.

Objective: Given the wide use of this program, the purpose of this study is to update previous validation results with two newer versions of LIWC.

Methods: Tests of proportions were conducted using the total number of emotion words identified by human coders for each emotional category as the reference group. In addition to tests of proportions, we calculated F scores to evaluate the accuracy of LIWC 2001, LIWC 2007, and LIWC 2015.

Results: Results indicate that LIWC 2001, LIWC 2007, and LIWC 2015 each demonstrate good sensitivity for identifying emotional expression, whereas LIWC 2007 and LIWC 2015 were significantly more sensitive than LIWC 2001 for identifying emotional expression and positive emotion; however, more recent versions of LIWC were also significantly more likely to overidentify emotional content than LIWC 2001. LIWC 2001 demonstrated significantly better precision (F score) for identifying overall emotion, negative emotion, and anxiety compared with LIWC 2007 and LIWC 2015.

Conclusions: Taken together, these results suggest that LIWC 2001 most accurately reflects the emotional identification of human coders.
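The two calculations named in the Methods, an F score and a test of proportions, can be illustrated with a minimal Python sketch. This is not the authors' analysis code: the function names, the word counts, and the choice of a pooled two-proportion z-test are all assumptions made for illustration, and the abstract does not specify which test of proportions was used. The sketch also treats the two samples as independent, which is a simplification if the programs are scored on the same texts.

```python
import math


def precision_recall_f(true_pos, false_pos, false_neg):
    """Return (precision, recall, F score) for one emotion category.

    Human-coded emotion words serve as the reference:
      true_pos  = words flagged by both the program and the human coders
      false_pos = words flagged by the program only (overidentification)
      false_neg = words flagged by the human coders only (misses)
    """
    flagged = true_pos + false_pos
    reference = true_pos + false_neg
    precision = true_pos / flagged if flagged else 0.0
    recall = true_pos / reference if reference else 0.0  # also called sensitivity
    if precision + recall == 0:
        return precision, recall, 0.0
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


def two_proportion_z(hits_a, n_a, hits_b, n_b):
    """Pooled two-proportion z-test; returns (z, two-sided p value)."""
    p_a = hits_a / n_a
    p_b = hits_b / n_b
    pooled = (hits_a + hits_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / std_err
    # Two-sided p value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts, not data from the study.
precision, recall, f_score = precision_recall_f(true_pos=80, false_pos=20, false_neg=20)
z, p_value = two_proportion_z(hits_a=90, n_a=100, hits_b=70, n_b=100)
```

Under this framing, a program can gain sensitivity (recall) while losing precision if it flags more words overall, and the F score, as the harmonic mean of the two, drops whenever either one does.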

Highlights

  • Recent studies have provided evidence that emotions can be effectively identified in written text [1,2,3,4]

  • Many computerized text analysis programs are not validated for the identification of emotional expression and still have time-consuming data preparation to ensure that the text is clear of all typographical errors [9]

  • The average percentage of words identified by Linguistic Inquiry and Word Count (LIWC) 2001, LIWC 2007, LIWC 2015, and human coders as emotion, positive emotion, and negative emotion as well as specific subcategories of anxiety, anger, and sadness ranged from 0.1% to 4.1% of total words (Table 1)


Summary

Introduction

Recent studies have provided evidence that emotions can be effectively identified in written text [1,2,3,4]. Although computer analysis has become increasingly efficient in evaluating written text, it lacks the nuance and accuracy provided by human coders [9]. Although qualitative analysis provides the most complete method for characterizing text-based communications [10], the cost, time requirements, and subjectivity of manual coding make these methods prohibitively difficult for many applications. Many computerized text analysis programs are not validated for the identification of emotional expression and still require time-consuming data preparation to ensure that the text is clear of all typographical errors [9]. Given the high volume of text-based communication such as email, Facebook, Twitter, and additional web-based and mobile apps, there are unique opportunities to use text to better understand underlying psychological constructs such as emotion. Emotion recognition in text is critical to commercial enterprises (eg, understanding the valence of customer reviews) and to current and emerging clinical applications (eg, as markers of clinical progress and risk of suicide), and the Linguistic Inquiry and Word Count (LIWC) is a commonly used program.

