Abstract

This paper presents a multifunctional tool developed within the TAGFACT project, which aims to automate the annotation of factuality (understood as the degree of commitment with which the writer presents situations) in Spanish journalistic texts. The tool supports both the compilation of texts and the manual annotation of predicates. The corpus created with it was compiled in groups of three pieces of news covering the same event, taken from newspapers with different ideologies (left wing, right wing and centrist). It is made up of 176 different pieces of news, containing 1,359 sentences and 46,947 words. So far, the tool has been used to manually annotate a section of the ‘Gold Standard’ (approximately 10,000 words). It has proved to be versatile: it supports both the creation and management of corpora and their annotation, with whatever tags the user chooses for the purpose of each corpus.

Highlights

  • The categorization of events with respect to their factual status is an area of growing interest in the field of Corpus Linguistics and Natural Language Processing

  • In this paper we have presented the tool created in the TAGFACT project, whose main objective is to create a tool to automatically annotate factuality in Spanish

  • This task has become especially relevant in the last few years in the field of Natural Language Processing

Summary

INTRODUCTION

Several projects have addressed the annotation of corpora, either manual or automatic, with this type of information. The objective of our project (TAGFACT), which is two years into its development, is to create a system for the automatic annotation of the degree of certainty implicit in the situations narrated in Spanish journalistic texts. One of the first steps in the project was the creation of a corpus of Spanish journalistic texts (the TAGFACT corpus); a portion of this corpus, which will constitute the ‘Gold Standard’, is being annotated manually.

THE ANNOTATION OF FACTUALITY
THE TAGFACT CORPUS
Corpus annotation
Findings
SUMMARY AND CONCLUSIONS