Abstract

Mass media not only reflect the activities of state bodies but also shape the informational context, sentiment, depth, and level of significance attributed to state initiatives and social events. A multifaceted and, as far as practicable, quantitative assessment of media activity is important for understanding the media’s objectivity, role, focus, and, ultimately, the quality of society’s “fourth power”. The paper proposes a method for evaluating the media in several modalities (topics, evaluation criteria/properties, classes) that combines topic modeling of text corpora with multiple-criteria decision making (MCDM). The evaluation proceeds as follows: once a topic model of the corpus has been built, the conditional probability distribution of media over topics, properties, and classes is calculated. Several approaches described in the paper are used to obtain weights that describe how each topic relates to each evaluation criterion/property and to each class, including manual high-level labeling, a multi-corpora approach, and an automatic approach. The proposed multi-corpora approach assesses the topical asymmetry of the corpora to obtain the weights describing each topic’s relationship to a given criterion/property. These weights, combined with the topic model, can be applied to evaluate each document in the corpus according to each of the considered criteria and classes. The proposed method was applied to a corpus of 804,829 news publications from 40 Kazakhstani sources, published from 1 January 2018 to 31 December 2019, to classify negative information on socially significant topics. A BigARTM model with 200 topics was built and the proposed method was applied, including filling in an analytic hierarchy process (AHP) table and carrying out all of the necessary high-level labeling procedures.
Experiments confirm that the media can, in general, be evaluated using a topic model of the text corpus: an area under the receiver operating characteristic curve (ROC AUC) of 0.81 was achieved in the classification task, which is comparable with the results obtained for the same task with the BERT (Bidirectional Encoder Representations from Transformers) model.
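
The step of combining topic-to-criterion weights with the topic model can be illustrated with a minimal sketch. It assumes that each document is represented by its topic distribution and that a document's score for a criterion is the weight-averaged sum over topics; the toy numbers, the two criteria, and the names `score_documents` and `aggregate_by_source` are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def score_documents(theta, weights):
    """Score each document on each criterion.

    theta   -- per-document topic distributions (each row sums to 1)
    weights -- per-topic rows holding one weight per criterion
    """
    n_criteria = len(weights[0])
    scores = []
    for doc_topics in theta:
        # Expected criterion weight under the document's topic distribution.
        row = [
            sum(p * weights[t][c] for t, p in enumerate(doc_topics))
            for c in range(n_criteria)
        ]
        scores.append(row)
    return scores


def aggregate_by_source(scores, sources):
    """Average the document scores of each media source, per criterion."""
    grouped = defaultdict(list)
    for row, source in zip(scores, sources):
        grouped[source].append(row)
    return {
        source: [sum(col) / len(rows) for col in zip(*rows)]
        for source, rows in grouped.items()
    }


# Toy data (assumed): 3 documents, 3 topics, 2 criteria.
theta = [
    [0.8, 0.1, 0.1],
    [0.2, 0.7, 0.1],
    [0.1, 0.1, 0.8],
]
weights = [
    [1.0, 0.0],  # topic 0 relates only to criterion 0
    [0.0, 1.0],  # topic 1 relates only to criterion 1
    [0.5, 0.5],  # topic 2 relates equally to both criteria
]
sources = ["source_a", "source_a", "source_b"]

doc_scores = score_documents(theta, weights)
source_scores = aggregate_by_source(doc_scores, sources)
```

In this sketch the weights could come from any of the three approaches named in the abstract (manual labeling, multi-corpora, automatic); only the final weighting and averaging step is shown.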

Highlights

  • According to the 2019 Edelman Trust Barometer survey conducted in 27 countries, trust in government information and media channels remains low

  • The method we propose is based on a topic modeling and multiple-criteria decision making (MCDM) approach, which reduces the cost of labeling the corpus of documents

  • The main goal of such a tool is to provide experts, researchers, managers, and supervisors with a comprehensive set of analytical tools for obtaining up-to-date, relevant reports, visualizations, and evaluations of mass media publications in a given area of interest. Where large volumes of labeled texts cannot be obtained, these tasks can be addressed by applying the MCDM approach in combination with topic modeling of news corpora.
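
As an illustration of the MCDM side, the sketch below derives criterion weights from an AHP pairwise comparison table. It uses the row geometric mean, a common approximation to the principal-eigenvector weights, and is not necessarily the procedure used in the paper; the comparison values and the name `ahp_weights` are assumptions made for the example.

```python
import math


def ahp_weights(pairwise):
    """Approximate AHP priority weights with the row geometric mean method.

    pairwise[i][j] states how much more important criterion i is than
    criterion j; the matrix is expected to be reciprocal.
    """
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


# Hypothetical expert judgments for three criteria.
pairwise = [
    [1.0, 3.0, 5.0],
    [1 / 3, 1.0, 2.0],
    [1 / 5, 1 / 2, 1.0],
]
criteria_weights = ahp_weights(pairwise)
```

The resulting weights sum to 1 and can be used to combine per-criterion scores into a single evaluation. A full AHP workflow would also check the consistency of the expert judgments, which this sketch omits.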

Summary

Introduction

According to the 2019 Edelman Trust Barometer survey conducted in 27 countries, trust in government information and media channels remains low. The availability of many news sources on the Internet, such as personal TV, blogs, and unverified news, is an additional factor affecting our perception and can create confusion rooted in subjective personal opinion [5]. In this regard, it is important to understand how the media use their influence, in order to mitigate the negative impacts of the media and encourage positive effects [6]. We discuss the results obtained and the advantages and disadvantages of the proposed algorithm, and outline directions for future research.

Media Monitoring Tasks and Tools
Topic Modeling and MCDM
MMA Workflow
MMA–Description of Implementation
Data and Verification
Conclusions
Findings