Abstract
Journalistic transparency has become a key issue in the face of the credibility problems that journalists are exposed to, as well as the activity of media manipulators and fake news providers. Using Natural Language Processing (NLP) and Machine Learning (ML), it is possible to automate the extraction of information from newspaper articles in order to identify the sources of information and verify their veracity. In this article, we present the application of Conditional Random Fields (CRFs) to a specific type of Entity Recognition (ER) task, namely, identifying what we call the “reporter” in newspaper articles, i.e., who or what is the provider of the information. To this end, we have created a labelled corpus for the Spanish language and trained and analysed several CRF models with a set of specific features. The results obtained provide a solid baseline for our goal.