Modern Talking in Key Point Analysis: Key Point Matching using Pretrained Encoders

Jan Heinrich Reimer, Max Henze, Thi Kim Hanh Luu, Yamen Ajjour

doi:10.18653/v1/2021.argmining-1.18

Abstract

We contribute to the ArgMining 2021 shared task on Quantitative Summarization and Key Point Analysis with two approaches for argument key point matching. For key point matching, the task is to decide if a short key point matches the content of an argument with the same topic and stance towards the topic. We approach this task in two ways: First, we develop a simple rule-based baseline matcher by computing token overlap after removing stop words, stemming, and adding synonyms/antonyms. Second, we fine-tune pretrained BERT and RoBERTa language models as a regression classifier for only a single epoch. We manually examine errors of our proposed matcher models and find that long arguments are harder to classify. Our fine-tuned RoBERTa-Base model achieves a mean average precision score of 0.913, the best score for strict labels of all participating teams.

Highlights

Arguments influence our decisions in many places of our daily life (Bar-Haim et al., 2020a).
The dataset used in the ArgMining 2021 shared task on Quantitative Summarization and Key Point Analysis is the ArgKP-2021 dataset (Bar-Haim et al., 2020a), which consists of 24 083 argument and key point pairs labeled as matching/nonmatching.
The task organizers claim that this removal of 50 % of the pairs is necessary because some arguments do not match any of the key points, which would influence mean average precision negatively (Friedman et al., 2021).

Summary

Introduction

Arguments influence our decisions in many places of our daily life (Bar-Haim et al., 2020a). Because the dataset of the ArgMining 2021 shared task on Quantitative Summarization and Key Point Analysis is relatively small (24 083 labeled pairs), we decide to fine-tune BERT and RoBERTa language models rather than train a neural classifier from scratch (Section 4). In contrast to this neural approach, we introduce a simple rule-based baseline matcher that compares preprocessed tokens of each argument to the tokens of each key point (Section 4). The dataset used in the shared task is the ArgKP-2021 dataset (Bar-Haim et al., 2020a), which consists of 24 083 argument and key point pairs labeled as matching/nonmatching. The pairs all belong to one of 28 controversial topics, for example: “Assisted suicide should be a criminal offence”.

Token Overlap Baseline
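
The abstract describes the baseline as token overlap computed after stop word removal, stemming, and adding synonyms/antonyms. The following is only a minimal sketch of that idea, not the authors' implementation. The stop word list, the suffix-stripping `crude_stem` helper, the optional `synonyms` lookup, and the choice to normalize by the number of key point terms are all assumptions made for illustration.

```python
import re

# Tiny illustrative stop word list (an assumption, not the list used in the paper).
STOP_WORDS = {"a", "an", "the", "is", "are", "be", "to", "of",
              "and", "in", "it", "that", "should"}


def crude_stem(token: str) -> str:
    """Strip a few common English suffixes; a stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text: str, synonyms=None) -> set:
    """Lowercase, tokenize, drop stop words, stem, and optionally expand terms.

    `synonyms` is a hypothetical dict mapping a stemmed term to related
    (already stemmed) terms, such as synonyms or antonyms.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    terms = {crude_stem(t) for t in tokens if t not in STOP_WORDS}
    if synonyms:
        for term in list(terms):
            terms.update(synonyms.get(term, ()))
    return terms


def overlap_score(argument: str, key_point: str, synonyms=None) -> float:
    """Fraction of key point terms that also occur in the argument."""
    arg_terms = preprocess(argument, synonyms)
    kp_terms = preprocess(key_point, synonyms)
    if not kp_terms:
        return 0.0
    return len(arg_terms & kp_terms) / len(kp_terms)
```

A score near 1 suggests that most key point terms appear in the argument. A threshold or ranking over these scores would then decide the match. How the paper turns overlap into a final score is not stated in this summary.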

Transformers Fine-tuning

Results
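
The reported metric is mean average precision, and the highlights mention that 50 % of the pairs are removed before it is computed. The sketch below shows one way such a metric could be calculated. It assumes that pairs are grouped (for example per topic and stance), that the lowest-scoring pairs in each group are the ones dropped, and that labels are binary. The exact protocol is defined by the task organizers (Friedman et al., 2021) and may differ. The names `groups` and `keep_fraction` are hypothetical.

```python
def average_precision(labels_ranked) -> float:
    """Average precision for binary labels already sorted by descending score."""
    hits = 0
    precision_sum = 0.0
    for rank, label in enumerate(labels_ranked, start=1):
        if label:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(groups, keep_fraction: float = 0.5) -> float:
    """Mean of per-group average precision.

    `groups` maps a group id to a list of (score, label) pairs.
    Only the top `keep_fraction` of each group's pairs, ranked by score,
    is evaluated (an assumption about how the 50 % removal works).
    """
    ap_values = []
    for pairs in groups.values():
        ranked = sorted(pairs, key=lambda p: p[0], reverse=True)
        kept = ranked[: max(1, int(len(ranked) * keep_fraction))]
        ap_values.append(average_precision([label for _, label in kept]))
    return sum(ap_values) / len(ap_values) if ap_values else 0.0
```

Dropping low-scoring pairs keeps arguments that match no key point from dragging the ranking quality down, which is the motivation the organizers give for the removal.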

Findings

Conclusion and Future Work

Future Work