Abstract

To combat information operations (IO) and disinformation campaigns, one must look at the behaviors of the accounts pushing specific narratives and stories through social media, not at the content itself. In this work, we present a new process for extracting tweet storms and uncovering networks of accounts that are working in a coordinated fashion using ridge count thresholding (RCT). We started with a dataset of 60 million individual tweets from the early weeks of the Covid-19 pandemic. Coherent topics were extracted from this data by testing three different preprocessing pipelines and applying Orthogonal Nonnegative Matrix Factorization (ONMF). The most effective preprocessing pipeline used hashtag preclustering to downselect the total dataset to the 7 million tweets that included the top hashtags. Each topic identified by ONMF is described by a topic-tweet signal, constructed from the time stamp included in each tweet’s metadata. These signals were broken down into tweet storms using RCT, which is calculated from the Dynamic Wavelet Fingerprint transform of each topic-tweet signal. Each tweet storm describes a period of increased activity around a topic and represents some behavior in the underlying network. In total, we identified 39,817 tweet storms that included about 2 million unique tweets. We then used these tweet storms to identify networks of accounts that commonly co-occur within tweet storms, isolating the communities most responsible for driving narratives and pushing stories through social media. Through this process, we identified 22 unique networks of accounts that were densely connected based on RCT tweet storm identification. Many of the identified networks exhibit obvious inauthentic behaviors that are potentially part of an IO campaign.
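
The final step, grouping accounts that commonly co-occur within tweet storms, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how co-occurrence is weighted or how densely connected networks are extracted, so the pair-count threshold (`min_shared`), the use of connected components as a stand-in for community detection, and the toy account names are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from itertools import combinations


def cooccurrence_edges(storms, min_shared=2):
    """Count the storms each account pair shares; keep pairs seen at least min_shared times.

    `storms` is a list of storms, each given as a list of account IDs.
    `min_shared` is a hypothetical threshold, not a value from the paper.
    """
    counts = Counter()
    for accounts in storms:
        # Deduplicate within a storm so an account that tweets twice counts once.
        for a, b in combinations(sorted(set(accounts)), 2):
            counts[(a, b)] += 1
    return {pair: n for pair, n in counts.items() if n >= min_shared}


def connected_groups(edges):
    """Group accounts into connected components of the thresholded graph.

    Connected components are a simple stand-in for a real community
    detection method, which the abstract does not specify.
    """
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen, groups = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(adjacency[node] - group)
        seen |= group
        groups.append(group)
    return groups


# Toy data: five storms over five made-up accounts.
storms = [
    ["a", "b", "c"],
    ["a", "b"],
    ["c", "d"],
    ["d", "e", "c"],
    ["a", "b", "e"],
]
edges = cooccurrence_edges(storms, min_shared=2)
groups = connected_groups(edges)
print(edges)
print(groups)
```

In this toy data only the pairs (a, b) and (c, d) share two or more storms, so the sketch yields two small groups. A real analysis would likely need a weighting that accounts for storm size and a community detection method suited to dense subgraphs.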
