Abstract

The subject of the study is a methodology for analyzing the electronic content of social networks (forums) as a historical source. Discussions of the 1917 revolution held during the centenary of the event served as the material for analysis. The aim of the study was to test approaches to working with large arrays of online texts, and in particular to combine two of them: quantitative analysis tools (distant reading) and traditional methods of working with historical texts (slow reading). For the "distant reading", topic modeling is performed with the LDA (latent Dirichlet allocation) and LSA (latent semantic analysis) algorithms in the R programming environment, in the RStudio program (version 4.2.1). For the "slow reading", the entire text is analyzed directly.

The novelty of the research lies in applying topic modeling to sources in the R programming environment in conjunction with classical methods of analyzing historical texts. The study tests a methodology for analyzing the content of social networks (forums) that is aimed at arrays of text too large to be read in full, or even in significant part, by a researcher relying exclusively on traditional means of working with a corpus of sources. A step-by-step research algorithm is proposed. First, the researcher analyzes the text by "distant reading" methods, identifying topics, each of which consists of terms (words). Then, using these terms as keywords, the researcher finds the text fragments in which an identified topic was discussed most actively and analyzes those fragments in more detail using traditional methods of working with a text source.

A possible way to improve the quality with which the LDA algorithm identifies the topics the researcher needs in social networks and forums is also proposed: the large text is first split into fragments, and the fragments are then analyzed by LDA as separate documents.
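
The splitting and keyword-retrieval steps described above can be illustrated with a minimal sketch. It is not the authors' code: the study worked in R, and the LDA step itself is not reproduced here. The function names, the fixed fragment size, and the scoring rule (share of tokens that are topic terms) are all assumptions made for illustration; in practice the topic terms would come from the output of the topic model.

```python
import re
from collections import Counter


def split_into_fragments(text, words_per_fragment=200):
    """Split one large text into consecutive fragments of a fixed word count.

    Hypothetical helper: each fragment could then be passed to a topic
    model as a separate document. The fragment size is an assumption.
    """
    words = text.split()
    return [
        " ".join(words[i:i + words_per_fragment])
        for i in range(0, len(words), words_per_fragment)
    ]


def rank_fragments(fragments, topic_terms, top_n=3):
    """Return (index, score) pairs for the fragments densest in topic terms.

    The score is the share of a fragment's tokens that belong to the topic,
    an assumed stand-in for "discussed most actively".
    """
    terms = {t.lower() for t in topic_terms}
    scored = []
    for idx, fragment in enumerate(fragments):
        tokens = re.findall(r"\w+", fragment.lower())
        if not tokens:
            continue  # skip fragments with no word tokens
        counts = Counter(tokens)
        hits = sum(counts[t] for t in terms)
        scored.append((idx, hits / len(tokens)))
    # Highest density first; ties keep their original order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]
```

The fragments returned by `rank_fragments` are the candidates for "slow reading". Density rather than a raw count is used so that longer fragments are not favored automatically.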
