Abstract

This research addresses the problem of studying the cluster structure of multidimensional data sets. The paper presents the results of numerical experiments on data sets consisting of co-occurrence frequencies of words from different parts of speech, for instance “noun + verb” or “adjective + noun”. The data sets are obtained from samples of Russian-language text collections. The aim of the research is to analyze the cluster structure of the studied data set and the semantic proximity of words within clusters and subclusters. The working hypothesis is that words with similar meanings should occur in approximately the same contexts. In the feature space, such words will therefore lie relatively close to each other, while words with different meanings will lie farther apart. The research is carried out using elastic maps, which are effective tools for the visual analysis of multidimensional data. Constructing elastic maps and their extensions in the space of the first three principal components makes it possible to determine the cluster structure of the studied multidimensional data sets. Such analysis can be useful in countering negative verbal influences such as fake news, hidden propaganda, recruitment into sects, and verbal manipulation. The approach can also be applied to text collections of medical origin.
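The hypothesis that words used in similar contexts lie close together in feature space can be illustrated with a minimal sketch. It is not the elastic map or principal component procedure used in the paper: it only compares normalized co-occurrence rows by Euclidean distance. The words (English stand-ins for Russian ones), the counts, and the function names are all hypothetical and chosen purely for illustration.

```python
import math

# Hypothetical toy data: each adjective is a point, each noun is a feature.
# Values are invented co-occurrence counts for "adjective + noun" pairs.
NOUNS = ["tea", "coffee", "wind", "rain"]
COUNTS = {
    "hot":        [12, 15, 1, 0],
    "warm":       [9, 7, 3, 2],
    "heavy":      [0, 1, 5, 16],
    "torrential": [0, 0, 1, 9],
}


def normalize(row):
    """Convert raw counts to relative frequencies so frequent words do not dominate."""
    total = sum(row)
    return [count / total for count in row]


# Points in the feature space spanned by the nouns.
POINTS = {word: normalize(row) for word, row in COUNTS.items()}


def distance(word_a, word_b):
    """Euclidean distance between two words in the noun feature space."""
    return math.dist(POINTS[word_a], POINTS[word_b])


def nearest_neighbor(word):
    """Return the other word whose context profile is closest to `word`."""
    others = (w for w in POINTS if w != word)
    return min(others, key=lambda other: distance(word, other))


if __name__ == "__main__":
    for word in POINTS:
        print(word, "->", nearest_neighbor(word))
```

In this toy data, adjectives that share contexts end up as mutual nearest neighbors, which is the kind of structure a clustering or visualization method would then be expected to expose on real data.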

Highlights

  • The analysis of multidimensional data is currently one of the main directions in computer science, computational mathematics, mathematical modeling, and computer engineering

  • The main focus is on whether elastic map methods can be applied to the analysis of the thematic proximity of Russian words

  • We studied a transposed data set in which nouns played the role of dimensions and adjectives were treated as points in the multidimensional space


Summary

Introduction

The analysis of multidimensional data is currently one of the main directions in computer science, computational mathematics, mathematical modeling, and computer engineering. An analytical study of data, together with their generalization and the identification of key dependencies, is what gives the data meaning. The need to process, visualize and analyze multidimensional data has led to the intensive development of visual analytics tools (Wong, Thomas, 2004), (Thomas, Cook, 2005), (Kielman, Thomas, 2009), (Keim et al., 2010). The approaches and methods of visual analytics are constantly evolving and provide users with sufficiently reliable tools for solving many practical problems of multidimensional data exploration. These problems include data classification, cluster detection, identification of key defining parameters, establishing relationships between key parameters, etc.

