Still out there: Modeling and Identifying Russian Troll Accounts on Twitter

Jane Im, Ankit Bhargava, Jackson Sargent, Taylor Denby, Eric Gilbert, Eshwar Chandrasekharan, Libby Hemphill, David Jurgens, Paige Lighthammer

doi:10.1145/3394231.3397889

Abstract

There is evidence that Russia’s Internet Research Agency attempted to interfere with the 2016 U.S. election by running fake accounts on Twitter—often referred to as “Russian trolls”. In this work, we: 1) develop machine learning models that predict whether a Twitter account is a Russian troll within a set of 170K control accounts; and, 2) demonstrate that it is possible to use this model to find active accounts on Twitter still likely acting on behalf of the Russian state. Using both behavioral and linguistic features, we show that it is possible to distinguish between a troll and a non-troll with a precision of 78.5% and an AUC of 98.9%, under cross-validation. Applying the model to out-of-sample accounts still active today, we find that up to 2.6% of top journalists’ mentions are occupied by Russian trolls. These findings imply that the Russian trolls are very likely still active today. Additional analysis shows that they are not merely software-controlled bots, and manage their online identities in various complex ways. Finally, we argue that if it is possible to discover these accounts using externally-accessible data, then the platforms—with access to a variety of private internal signals—should succeed at similar or better rates.

Full Text