Abstract

In this paper we present US2016, the largest publicly available set of corpora of annotated dialogical argumentation. The annotation covers argumentative relations, dialogue acts and pragmatic features. The corpora comprise transcriptions of television debates leading up to the 2016 US presidential elections, and reactions to the debates on Reddit. These two constitutive parts of the corpora are integrated by means of the intertextual correspondence between them. The rhetorical richness and high argument density of the communicative context result in cross-genre corpora that are robust resources for the study of the dialogical dynamics of argumentation in three ways: first, in empirical strands of research in discourse analysis and argumentation studies; second, in the burgeoning field of argument mining, where automatic techniques require such data; and third, in formulating algorithmic techniques for sensemaking through the development of Argument Analytics.

Highlights

  • Argument and debate are as ubiquitous as they are fundamental to the functioning of society

  • The US2016 corpus and its component parts are a unique set of resources that represents a number of firsts

  • US2016tv is the largest corpus of analysed dialogical argumentation currently available

Summary

Introduction

Argument and debate are as ubiquitous as they are fundamental to the functioning of society. The lack of data has severely hampered research on argumentation and has hobbled development in the nascent field of argument mining in particular. The dearth of such resources is rooted in two key challenges: first, the technical challenge of distilling the rich work of argumentation theory into a theoretically coherent approach which can be translated into a practical set of annotation guidelines; and second, the prosaic challenge of the labour-intensive nature of annotation, given that it typically requires training and is not, in general, delegable to crowdsourced solutions. We include precisely contemporaneous online reaction, in particular from the social media platform Reddit. This sets the scene for an unusually rich dataset, which captures dialogical interaction (as opposed to monological, and often artificially generated, argument, which is much more common) and allows exploration of reaction in social media. A brief indication is offered of the types of research benefiting from the newly developed resource (Sect. 6), and of how this relates to the existing literature (Sect. 7).

Argumentation in discourse
Argumentation in televised election debates
Argumentation in political social media discussions
Theoretical foundations
Summary of annotation guidelines
Annotation software
Data collection
Structure and availability of the corpus
Validation
Corpus properties
Intertextual correspondence
Annotation of intertextual correspondence
The intertextual correspondence sub-corpus
Related work
Findings
Conclusion