Abstract

The generation and spread of fake news within new and online media sources is emerging as a phenomenon of high societal significance. Combating it using data-driven analytics has attracted much recent scholarly interest. In this computational social science study, we analyze the textual coherence of fake news articles vis-a-vis legitimate ones. We develop three computational formulations of textual coherence drawing upon state-of-the-art methods in natural language processing and data science. Two real-world datasets from widely different domains, each with fake/legitimate article labellings, are then analyzed with respect to textual coherence. We observe apparent differences in textual coherence across fake and legitimate news articles, with fake news articles consistently scoring lower on coherence than legitimate ones. While the relative coherence shortfall of fake news articles as compared to legitimate ones forms the main observation from our study, we also analyze several aspects of the differences and outline potential avenues of further inquiry.

Highlights

  • The spread of fake news is increasingly being recognized as a global issue of enormous significance

  • There has been some recent interest in characterizing fake news in terms of various aspects of textual content. Previous work along this direction has considered satirical cues [26], expression of stance [4], rhetorical structures [25] and topical novelty [30]. In this computational social science study, we evaluate the textual coherence of fake news articles vis-a-vis legitimate ones

  • We observe that moving to higher-level features for fake news identification has not yet become a widespread trend within the data science community; this is likely because investigating specific textual characteristics does not necessarily, on its own, improve the state-of-the-art for fake news detection under conventional metrics such as empirical accuracy

Summary

Introduction

The spread of fake news is increasingly being recognized as a global issue of enormous significance. The news ecosystem has evolved from a small set of regulated and trusted sources to numerous online news outlets and social media platforms. Such new media sources come with limited liability for disinformation, and are easy vehicles for fake news. There has been some recent interest in characterizing fake news in terms of various aspects of textual content. Previous work along this direction has considered satirical cues [26], expression of stance [4], rhetorical structures [25] and topical novelty [30]. In this computational social science study, we evaluate the textual coherence of fake news articles vis-a-vis legitimate ones. Cohesion has been considered an important feature for assessing the structure of text [21] and has been argued to play a role in writing quality [3,6,17].
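To make the notion of textual coherence concrete, the sketch below shows one simple lexical-overlap proxy: the mean cosine similarity between the bag-of-words vectors of adjacent sentences. This is an illustrative assumption for exposition only, not the paper's actual formulations (the study develops three formulations that are not reproduced here).

```python
import math
import re
from collections import Counter

def sentence_vector(sentence):
    """Bag-of-words term counts for a lowercased sentence."""
    return Counter(re.findall(r"[a-z']+", sentence.lower()))

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[w] * v[w] for w in set(u) & set(v))
    norm = (math.sqrt(sum(c * c for c in u.values()))
            * math.sqrt(sum(c * c for c in v.values())))
    return dot / norm if norm else 0.0

def lexical_coherence(sentences):
    """Mean cosine similarity over adjacent sentence pairs.

    A crude lexical proxy for coherence: higher scores mean adjacent
    sentences share more vocabulary. Returns 0.0 for texts with
    fewer than two sentences.
    """
    if len(sentences) < 2:
        return 0.0
    vectors = [sentence_vector(s) for s in sentences]
    sims = [cosine(vectors[i], vectors[i + 1])
            for i in range(len(vectors) - 1)]
    return sum(sims) / len(sims)
```

Under this proxy, an article whose sentences stay on topic (repeating referents and vocabulary) scores higher than one that jumps between unrelated statements, which is the kind of contrast the study measures between legitimate and fake articles.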

Related Work
Research Objective and Computational Framework
Computational Framework for Lexical Coherence
Profiling Fake and Legitimate News Using Lexical Coherence
Computational Approaches
Datasets
Experimental Setup
Analysis of Text Coherence
Towards More Generic Fake News Detection
Findings
Lexical Coherence and Other Disciplines
Conclusions and Future Work
