Abstract

The article examines the role of keywords and their statistical data in determining thematic dominance in large collections of texts. The description draws on a typological linguistic study of the texts of military songs from 1939-1945 in English and Russian. Keywords were selected on the basis of semantic, lexical-syntactic, and morphological analysis, taking into account their frequency of use. Frequency alone is not always the defining feature that marks a word as a keyword. Within a single text, keywords may be the words that help the reader understand its sense, unravel its deeper meaning, and remember its content. When a large number of texts are grouped by authorship, chronology, theme, style, or some other relation, keyword frequency matters and can serve as a determining factor and a classification criterion. The paper shows that the thematic distribution of texts obtained by semantic analysis of their content corresponds to the results of statistical analysis of the keywords and is confirmed by machine-computed frequency counts. The results hold for both the Russian-language and the English-language materials.
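
The idea of using keyword frequency as a classification criterion for a group of texts can be illustrated with a minimal sketch. It is not the authors' procedure: the function names `keyword_frequencies` and `dominant_theme`, the theme labels, the keyword lists, and the sample lines are all hypothetical and invented for illustration, and the sketch matches exact word forms only, without the morphological analysis the study relies on.

```python
import re
from collections import Counter


def keyword_frequencies(texts, keywords):
    """Count how often each keyword occurs across a group of texts.

    Simplification: exact word forms only, no lemmatization.
    """
    wanted = {k.lower() for k in keywords}
    counts = Counter()
    for text in texts:
        # [^\W\d_]+ matches runs of letters, including Cyrillic ones.
        for token in re.findall(r"[^\W\d_]+", text.lower()):
            if token in wanted:
                counts[token] += 1
    return counts


def dominant_theme(texts, theme_keywords):
    """Return the theme whose keywords occur most often, plus all totals."""
    totals = {
        theme: sum(keyword_frequencies(texts, words).values())
        for theme, words in theme_keywords.items()
    }
    return max(totals, key=totals.get), totals


# Invented sample lines, not taken from the study's corpus.
sample_texts = [
    "We will meet again, home is waiting",
    "Far from home the soldier dreams of home",
]
# Hypothetical theme-to-keyword mapping.
sample_themes = {
    "home": ["home", "mother"],
    "battle": ["fight", "soldier"],
}

theme, totals = dominant_theme(sample_texts, sample_themes)
print(theme, totals)
```

In this toy setup the theme with the highest summed keyword count is treated as dominant; a real study would first normalize word forms and weigh the counts against the semantic analysis of each text.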
