Abstract

Academic writing is a complex task that requires an author to be skilled in argumentation. The goal of an academic author is to communicate ideas clearly and to convince the reader of the veracity of their claims. However, few students in lower-level courses argue well, as this skill takes time to master. To contribute to the development of this skill, we introduce a freely available annotated corpus to support research on argumentation in Spanish. To build the corpus, we developed an annotation guide for identifying argumentation in paragraphs. The guide also outlines how to annotate sentence segments as either claims or premises and how to indicate relations (support or attack) between these components. We then created an annotated corpus of 444 sections, achieving almost perfect inter-annotator agreement for a number of its components. We used the corpus to perform an exploratory analysis aimed at identifying the amount of argumentation in each section as well as the patterns commonly used for argument identification. We then conducted experiments to automatically classify argumentative components using lexical, semantic, discourse, structural, and sentiment features. A Document Occurrence Representation was also tested on the corpus, constituting a novel approach to representing documents for automatic argument detection. The results of these classification experiments in Spanish are very promising and seem to support the idea that linguistic feature engineering plays a crucial role in text classification.
