Abstract

When assessing the similarity of arguments, researchers typically use approaches that do not provide interpretable evidence or justifications for their ratings. Hence, the features that determine argument similarity remain elusive. We address this issue by introducing novel argument similarity metrics that aim at high performance and explainability. We show that Abstract Meaning Representation (AMR) graphs can be useful for representing arguments, and that novel AMR graph metrics can offer explanations for argument similarity ratings. We start from the hypothesis that similar premises often lead to similar conclusions, and extend an approach for AMR-based argument similarity rating by estimating, in addition, the similarity of conclusions that we automatically infer from the arguments used as premises. We show that AMR similarity metrics make argument similarity judgements more interpretable and may even support argument quality judgements. Our approach provides significant performance improvements over strong baselines in a fully unsupervised setting. Finally, we take first steps toward addressing the problem of reference-less evaluation of argumentative conclusion generations.
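To make the approach described in the abstract concrete, the snippet below is a minimal, hypothetical sketch of combining premise-level and conclusion-level similarity. The parameters `generate_conclusion` and `amr_similarity` stand in for a conclusion generator and an AMR graph metric (e.g., S2MATCH applied to parsed texts), and the convex combination with weight `alpha` is an illustrative choice, not necessarily the aggregation used in the paper.

```python
# Minimal, hypothetical sketch: rate the similarity of two arguments by
# combining the similarity of the arguments themselves (premises) with the
# similarity of conclusions automatically inferred from them.
from typing import Callable


def argument_similarity(
    arg_a: str,
    arg_b: str,
    generate_conclusion: Callable[[str], str],
    amr_similarity: Callable[[str, str], float],
    alpha: float = 0.5,
) -> float:
    """Convex combination of premise-level and conclusion-level similarity.

    `generate_conclusion` stands in for a conclusion generator (e.g., a
    fine-tuned sequence-to-sequence model) and `amr_similarity` for an AMR
    graph metric such as S2MATCH computed on parsed texts; both are
    placeholders, and the weighting `alpha` is illustrative only.
    """
    premise_sim = amr_similarity(arg_a, arg_b)
    conclusion_sim = amr_similarity(
        generate_conclusion(arg_a), generate_conclusion(arg_b)
    )
    return alpha * premise_sim + (1.0 - alpha) * conclusion_sim
```

Because both scores come from AMR graph metrics, the shared and divergent graph parts can be inspected, which is where the explainability claimed in the abstract comes from.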

Highlights

  • Unsupervised approaches to argument similarity typically build on language models such as BERT (Devlin et al., 2019) or InferSent (Conneau et al., 2017).

  • This system is closely followed by other Abstract Meaning Representation (AMR)-based systems, e.g., one using a concept-focused S2MATCH variant that sees only the argument (68.17 macro score). For conclusion generation, we tune the model on premise-conclusion samples from Stab and Gurevych (2017). A simplified sketch of such a concept-level metric is given after this list.

  • Not much is known about how to rate the quality of a conclusion drawn from an argument. We explore this question by performing a manual assessment of different quality aspects of conclusions, and investigate to what extent these can be assessed with our AMR similarity metrics.

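The concept-level idea mentioned above can be pictured with a deliberately simplified sketch: score two AMR graphs by the overlap of their concept labels. This is only a crude stand-in for S2MATCH, which performs a soft, embedding-based alignment search between the graphs; the `penman` library is assumed for parsing, and the toy graphs are invented.

```python
# Crude stand-in for a concept-focused AMR metric: Dice overlap of the
# concept labels of two AMR graphs. S2MATCH itself instead searches for a
# soft, embedding-based alignment between the graphs.
import penman  # pip install penman


def concept_overlap(amr_a: str, amr_b: str) -> float:
    """Dice coefficient over the sets of AMR concept labels."""
    concepts_a = {inst.target for inst in penman.decode(amr_a).instances()}
    concepts_b = {inst.target for inst in penman.decode(amr_b).instances()}
    if not concepts_a and not concepts_b:
        return 1.0
    return 2 * len(concepts_a & concepts_b) / (len(concepts_a) + len(concepts_b))


# Invented toy graphs for two short argumentative statements.
g1 = "(b / ban-01 :ARG1 (g / gun))"
g2 = "(r / restrict-01 :ARG1 (g / gun))"
print(concept_overlap(g1, g2))  # 0.5 -- the graphs share the concept "gun"
```

Because such a score is computed over explicit graph parts, the shared and non-shared concepts can be cited as evidence for a rating, which is the kind of explanation the highlights refer to.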

Summary

Introduction

Unsupervised approaches to argument similarity typically build on language models such as BERT (Devlin et al., 2019) or InferSent (Conneau et al., 2017). Two key advantages of such approaches stem from their unsupervised setup: First, unsupervised methods do not rely on human annotations, which are expensive and can be subject to noise and biases. Indeed, previous supervised methods have been shown to learn less about argumentation tasks than had been assumed, exploiting spurious clues and artifacts in manually created data (Opitz and Frank, 2019; Niven and Kao, 2019).
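As a point of reference, the following is a minimal sketch of such an unsupervised embedding baseline: cosine similarity between sentence embeddings of two arguments. The `sentence-transformers` package, the `all-MiniLM-L6-v2` model, and the example sentences are assumptions for illustration; the paper's baselines build on BERT and InferSent representations and may be set up differently.

```python
# Illustrative unsupervised baseline: cosine similarity between sentence
# embeddings of two arguments. The model choice is an assumption for this
# example, not the configuration used in the paper.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

arg_a = "Gun ownership should be restricted because it increases accidents."
arg_b = "Fewer guns in private hands would mean fewer accidental shootings."

embeddings = model.encode([arg_a, arg_b], convert_to_tensor=True)
score = util.cos_sim(embeddings[0], embeddings[1]).item()
print(f"embedding-based similarity: {score:.3f}")
```

Such a score is easy to obtain without annotations, but it offers no handle for explaining why two arguments are rated as similar, which motivates the graph-based metrics above.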


