Journalistic transparency using CRFs to identify the reporter of newspaper articles in Spanish

Francisco Jurado

doi:10.1016/j.asoc.2020.106496

Journal: Applied Soft Computing
Publication Date: Jun 23, 2020
Citations: 4
License type: other-oa

Affiliation: Autonomous University of Madrid

Abstract

Journalistic transparency has emerged as a key response to the credibility problems that journalists face and to media manipulators and fake news providers. With Natural Language Processing (NLP) and Machine Learning (ML), it is possible to automate the extraction of information from newspaper articles in order to identify the sources of information and verify their veracity. In this article, we present the application of Conditional Random Fields (CRFs) to a specific type of Entity Recognition (ER) task: identifying what we call the “reporter” in newspaper articles, i.e., who or what provides the information. To this end, we have created a labelled corpus for the Spanish language and trained and analysed several CRF models with a set of specific features. The results obtained provide a solid baseline for our goal.
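To illustrate how a reporter-identification task of this kind can be framed as sequence labelling for a CRF, the following is a minimal sketch, not the feature set or tagging scheme used in the article. The feature names, the BIO-style tags (`B-REP`, `I-REP`, `O`), the helper functions, and the example sentence are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple


def token_features(tokens: List[str], i: int) -> Dict[str, object]:
    """Hypothetical per-token features of the kind a linear-chain CRF consumes."""
    tok = tokens[i]
    feats: Dict[str, object] = {
        "lower": tok.lower(),         # normalised surface form
        "is_title": tok.istitle(),    # capitalisation often marks named sources
        "is_upper": tok.isupper(),    # acronyms such as agency names
        "suffix3": tok[-3:].lower(),  # crude morphological cue
    }
    # Context window: neighbouring tokens, with sentence-boundary markers.
    feats["prev"] = tokens[i - 1].lower() if i > 0 else "<BOS>"
    feats["next"] = tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>"
    return feats


def bio_labels(n_tokens: int, span: Tuple[int, int]) -> List[str]:
    """BIO tags for a single reporter span [start, end) over n_tokens tokens."""
    start, end = span
    return [
        "B-REP" if i == start else "I-REP" if start < i < end else "O"
        for i in range(n_tokens)
    ]


# Invented example sentence; the assumed reporter span is "el Ministerio de Sanidad".
tokens = ["Según", "el", "Ministerio", "de", "Sanidad", ",",
          "los", "casos", "bajan", "."]
X = [token_features(tokens, i) for i in range(len(tokens))]
y = bio_labels(len(tokens), (1, 5))
```

Each sentence becomes a list of feature dictionaries `X` paired with a list of tags `y`, which is the input shape accepted by common CRF toolkits such as sklearn-crfsuite; training and evaluation are omitted here.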

Full Text