Abstract

We propose a new method for recognizing meaningful changes in the social state from transformations of the linguistic content of Arabic newspapers, where the detected alterations in the linguistic material serve as indicators. The approach works in an “online” fashion and uses pre-trained vector representations of Arabic words. After a pre-processing stage, the words in each issue’s text are replaced by vectors obtained with a word embedding methodology. The approach characterizes the consistent linguistic template of an issue by the similarities of the embedded vectors. A change in the distributions of the issue-based samples indicates a difference in the underlying newspaper template. A two-step procedure implements the concept. The first step compares the similarity distribution of the current issue with that of the union of several of its predecessors. Repeated under-sampling accompanied by a two-sample test stabilizes the sampling and returns a collection of the resulting p-values. In the second step, the entropy of these sets is calculated sequentially, so that the change points of the resulting time series indicate changes in the newspaper content. Numerical experiments on successive issues of several Arabic newspapers published during the Arab Spring period demonstrate the high reliability of the method.
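The two-step procedure can be illustrated with a minimal Python sketch. It is not the authors' implementation: the choice of the Kolmogorov-Smirnov test as the two-sample test, the asymptotic p-value approximation, the histogram-based entropy, and all names and parameters (`ks_two_sample_pvalue`, `undersampled_pvalues`, `pvalue_entropy`, `sample_size`, `repeats`, `bins`) are assumptions made for illustration. The inputs are synthetic numbers standing in for per-issue similarity samples.

```python
import math
import random


def ks_two_sample_pvalue(x, y):
    """Asymptotic p-value of a two-sample Kolmogorov-Smirnov test.

    Assumption: the source only says "a two-sample test"; KS is one
    possible choice, used here because it needs no external library.
    """
    xs, ys = sorted(x), sorted(y)
    n, m = len(xs), len(ys)
    i = j = 0
    d = 0.0
    # Walk both sorted samples and track the largest ECDF gap.
    while i < n and j < m:
        v = min(xs[i], ys[j])
        while i < n and xs[i] <= v:
            i += 1
        while j < m and ys[j] <= v:
            j += 1
        d = max(d, abs(i / n - j / m))
    n_eff = n * m / (n + m)
    lam = (math.sqrt(n_eff) + 0.12 + 0.11 / math.sqrt(n_eff)) * d
    if lam < 0.2:
        return 1.0  # the Kolmogorov tail probability is ~1 here
    tail = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
               for k in range(1, 101))
    return min(1.0, max(0.0, 2.0 * tail))


def undersampled_pvalues(current, reference, sample_size, repeats, rng):
    """Step 1: repeatedly under-sample both samples and test them."""
    pvals = []
    for _ in range(repeats):
        a = rng.sample(current, min(sample_size, len(current)))
        b = rng.sample(reference, min(sample_size, len(reference)))
        pvals.append(ks_two_sample_pvalue(a, b))
    return pvals


def pvalue_entropy(pvals, bins=10):
    """Step 2: Shannon entropy of the p-values binned on [0, 1]."""
    counts = [0] * bins
    for p in pvals:
        counts[min(int(p * bins), bins - 1)] += 1
    total = len(pvals)
    return -sum(c / total * math.log(c / total) for c in counts if c)


# Synthetic stand-ins for similarity samples (hypothetical data).
rng = random.Random(0)
reference = [rng.gauss(0.3, 0.1) for _ in range(600)]  # union of predecessors
same = [rng.gauss(0.3, 0.1) for _ in range(200)]       # unchanged template
shifted = [rng.gauss(0.5, 0.1) for _ in range(200)]    # changed template

h_same = pvalue_entropy(undersampled_pvalues(same, reference, 50, 100, rng))
h_shifted = pvalue_entropy(undersampled_pvalues(shifted, reference, 50, 100, rng))
print(h_same, h_shifted)
```

In this synthetic setup the p-values for the shifted sample concentrate near zero, so their entropy drops relative to the unchanged case. Computing the entropy issue by issue yields the time series whose change points the method inspects; the change-point detection itself is not sketched here.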

Highlights

  • The mass media have a powerful influence on modern society due to their close association with the so-called “mediated culture”

  • Many traditional techniques in the text mining area represent a text as a vector of term occurrences

  • The model is based on the word embedding methodology and a statistical procedure intended to recognize significant changes in the social state via changes in the linguistic performers of the considered media


Summary

Introduction

The mass media have a powerful influence on modern society due to their close association with the so-called “mediated culture”. Propagandistic purposes can be achieved by adapting the language of electronic and print media to the characteristics of the target audience, taking into account the level of education, political preferences, customs and traditions, gender, and linguistic (dialectal) features. For these reasons, it is natural to suppose that changes in the media language content may expose changes in the social state. The collection of all similarities (in our application, the well-known “cos-similarities”) of the appropriately chosen adjacent vectors characterizes the consistent linguistic template of the issue, and a change in the distributions of these issue-based samples indicates a difference in the underlying newspaper template. To implement this concept, a two-step procedure is proposed.
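As a rough illustration of how an issue-based similarity sample might be built, the Python sketch below computes cosine similarities between neighbouring word vectors. It assumes that "adjacent" means consecutive words in the pre-processed text, which the summary does not specify; the function names and the two-dimensional toy vectors are hypothetical, and real pre-trained Arabic embeddings would have far more dimensions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


def issue_similarity_sample(word_vectors):
    """Similarities of consecutive word vectors in one issue.

    Assumption: each word is paired with its immediate successor.
    """
    return [cosine_similarity(word_vectors[i], word_vectors[i + 1])
            for i in range(len(word_vectors) - 1)]


# Toy 2-D "embeddings" for a four-word issue (hypothetical values).
toy_issue = [[1.0, 0.0], [1.0, 1.0], [0.0, 2.0], [0.0, -1.0]]
sample = issue_similarity_sample(toy_issue)
print(sample)
```

An issue with n embedded words yields n - 1 similarities under this pairing, and it is the distribution of such samples that is compared from issue to issue.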

Background
Evolutionary Model of the Publishing Process
Material
Parameters’ Selection
Results
Findings
Conclusions and Discussion