Abstract

The problem of automatic analysis of argumentation in scientific communication texts is considered. Argumentation is understood as an ordered set of arguments used to support a certain thesis. An argument includes at least one premise and one conclusion, connected by an argumentative relation. The purpose of the work is an experimental study of neural network approaches to the problem of finding and extracting argumentative relations between statements located close to each other in the text. The study was conducted on a corpus of texts with argumentative markup created using a previously developed web platform. The corpus included scientific news texts, analytical articles from the Habr website, scientific articles, and reviews. Datasets for machine learning were built from these texts. To improve the quality of neural network model training, the datasets were augmented with new data produced by automatic paraphrasing and double-translation methods. Two approaches to training the models were considered: the first with labeling of indicators in the texts, and the second with preliminary training of a language model on the task of predicting indicators. To evaluate model performance, an approach was proposed based on estimates of inter-expert agreement, of the kind usually used to compare manually created text markups. A comparison of agreement coefficients between experts and trained models showed that the quality threshold for extracting argumentative relations was almost reached by the model with labeled indicators. A manual analysis of model errors was carried out by visualizing the obtained results. The novelty of the work thus lies in the application of an integrated approach to creating datasets, training models, and evaluating the results of automatic extraction of argumentative relations.
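The evaluation described above treats a trained model as another annotator and compares markups with a chance-corrected agreement coefficient. The abstract does not specify which coefficient is used; as an illustrative sketch, Cohen's kappa is one common choice for two annotators labeling the same items (here, hypothetical binary labels for whether an argumentative relation holds between a pair of statements):

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa: chance-corrected agreement between two annotators."""
    assert len(labels_a) == len(labels_b) and labels_a
    n = len(labels_a)
    # Observed agreement: fraction of items labeled identically
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement if the two annotators labeled independently
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(counts_a[k] * counts_b[k] for k in set(counts_a) | set(counts_b)) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical labels: expert vs. model on eight statement pairs
# (1 = argumentative relation present, 0 = absent)
expert = [1, 0, 1, 1, 0, 1, 0, 0]
model  = [1, 0, 1, 0, 0, 1, 1, 0]
print(cohens_kappa(expert, model))  # → 0.5
```

A kappa near that of two human experts on the same texts would indicate the model has reached the human agreement threshold, which is the criterion the work refers to.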
