Abstract

The study was carried out within the framework of identification linguistics and translational linguistics. The article describes several methods for determining the degree of similarity between texts: the shingle algorithm, the Levenshtein distance, and plagiarism detection systems. The purpose of the work is to test how well software can compare texts for similarity, establish their identity, and check their uniqueness. In a broad sense, these tasks belong to the field of text identification. In a qualitative (manual) assessment of text similarity, the identifying parameters are selected specifically for the text under study. Electronic resources are used in pursuit of objectivity, both in the methods for establishing the identity of texts and in the results obtained. Software products also make it possible to establish a further, quantitative characteristic: the degree of similarity of texts to each other, or the degree of originality of a text. The work used services that 1) compare the similarity of two texts; 2) calculate the Levenshtein distance; 3) detect borrowing. The research material was an excerpt from an interview with Foreign Minister Sergei Lavrov. Reverse machine translation texts served as the variants compared with the source text. Reverse machine translation, as a translation product, belongs to the field of artificial intelligence and models the process of understanding and interpreting natural language. The results obtained from these services made it possible to rank five reverse machine translation variants from the most unique text to the text most identical to the invariant. The study showed that the programs generally produce similar results, which can be applied to research and applied problems related to establishing the identity and difference of texts.
A prospect for further study is the identification of lexical parameters that make it possible to classify reverse machine translation texts as most or least identical to the invariant.
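
As a rough illustration of two of the measures named above, the sketch below computes a shingle-based similarity and the Levenshtein distance. It is not the implementation behind the services used in the study, which are not specified here. The word-level shingles, the shingle length k=3, the use of the Jaccard coefficient to compare shingle sets, and all function names are assumptions made for the example.

```python
def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word sequences) in text.

    Assumption: shingles are built from lowercased, whitespace-separated words.
    """
    words = text.lower().split()
    if len(words) < k:
        # Texts shorter than k words yield a single shingle (or none if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def shingle_similarity(a, b, k=3):
    """Jaccard coefficient of the two shingle sets, from 0.0 to 1.0.

    Assumption: Jaccard is one common way to compare shingle sets; a given
    service may use a different measure.
    """
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa and not sb:
        return 1.0  # two empty texts are treated as identical
    return len(sa & sb) / len(sa | sb)


def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn string a into string b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    # Hypothetical sentences, not taken from the research material.
    source = "the minister answered the question in detail"
    variant = "the minister answered this question in detail"
    print(shingle_similarity(source, variant))
    print(levenshtein(source, variant))
```

A higher shingle similarity and a lower Levenshtein distance both indicate a variant closer to the source text, so either value could in principle be used to order translation variants from most unique to most identical.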

