KEYWORDS IN TERMS OF FREQUENCY AND THEMATIC RELEVANCE

Natalia P Galkina

doi:10.34216/1998-0817-2022-28-3-180-185

Abstract

The article examines the role of keywords and their frequency statistics in determining the dominant themes of large text collections. The description draws on a typological linguistic study of English and Russian military song texts from 1939-1945. Keywords were selected through semantic, lexical-syntactic and morphological analysis, taking into account their frequency of use. Frequency alone does not always qualify a word as a keyword. Within a single text, keywords may be words that help the reader understand the sense, unravel its deeper meaning and remember the content. When a large number of texts are grouped by authorship, chronology, theme, style or some other relation, keyword frequency matters and can serve as a determining factor, a classification criterion. The paper shows that the thematic distribution of texts obtained by semantic analysis of their content corresponds to the results of statistical analysis of the keywords and is confirmed by machine-computed frequency counts. The results hold for both the Russian-language and the English-language materials.
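
The idea of using keyword frequency as a classification criterion over a group of texts can be illustrated with a minimal sketch. It is not the procedure used in the study: the theme lexicon `THEME_KEYWORDS`, the helper names and the plain summing of raw counts are all assumptions made for illustration, and the study's semantic, lexical-syntactic and morphological selection of keywords is not modelled here.

```python
import re
from collections import Counter

# Hypothetical theme lexicon, invented for illustration only;
# it is not taken from the study's materials.
THEME_KEYWORDS = {
    "home": {"home", "mother", "letter"},
    "battle": {"fight", "enemy", "gun"},
}


def tokenize(text):
    """Lowercase the text and return runs of letters (Latin or Cyrillic)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def keyword_frequencies(texts):
    """Count every word form across all texts in the collection."""
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    return counts


def theme_scores(texts, theme_keywords=THEME_KEYWORDS):
    """Sum the corpus frequencies of each theme's keywords."""
    counts = keyword_frequencies(texts)
    return {
        theme: sum(counts[word] for word in words)
        for theme, words in theme_keywords.items()
    }


def dominant_theme(texts, theme_keywords=THEME_KEYWORDS):
    """Return the theme whose keywords occur most often in the collection."""
    scores = theme_scores(texts, theme_keywords)
    return max(scores, key=scores.get)
```

Called on a collection of song texts, `theme_scores` gives one frequency total per theme and `dominant_theme` picks the largest. Raw counts favour longer texts and ignore inflected forms, so a fuller treatment would probably normalise by text length and lemmatise, which matters especially for Russian.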
