Abstract

Recent advances in language modeling have significantly improved the generative capabilities of deep neural models: in 2019 OpenAI released GPT-2, a pre-trained language model that can autonomously generate coherent, non-trivial, and human-like text samples. Since then, ever more powerful text-generation models have been developed. Adversaries can exploit these generative capabilities to enhance social bots, giving them the ability to write plausible deepfake messages and to contaminate public debate. To counter this, it is crucial to develop systems that detect deepfake social media messages. However, to the best of our knowledge, the detection of machine-generated texts on social networks such as Twitter or Facebook has never been addressed. To support research in this field, we collected TweepFake, the first dataset of real deepfake tweets: real in the sense that each deepfake tweet was actually posted on Twitter. We collected tweets from a total of 23 bots, imitating 17 human accounts. The bots rely on various generation techniques, i.e., Markov chains, RNNs, RNN+Markov, LSTMs, and GPT-2. We also randomly selected tweets from the humans imitated by the bots to obtain an overall balanced dataset of 25,572 tweets (half human-written and half bot-generated). The dataset is publicly available on Kaggle. Lastly, we evaluated 13 deepfake text detection methods (based on various state-of-the-art approaches) to both demonstrate the challenges that TweepFake poses and establish a solid baseline of detection techniques. We hope that TweepFake will offer the opportunity to tackle deepfake detection on social media messages as well.
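The following is a minimal sketch of how one might load the dataset and verify the human/bot balance described above; the file name "tweepfake.csv" and the column names "text" and "account.type" are assumptions for illustration, not taken from the dataset documentation.

```python
# Minimal sketch: load the TweepFake CSV (downloaded from Kaggle) and
# check the human/bot balance. The file name and column names below
# ("tweepfake.csv", "text", "account.type") are hypothetical placeholders.
import pandas as pd

df = pd.read_csv("tweepfake.csv")
print(len(df))                            # expected: 25,572 tweets overall
print(df["account.type"].value_counts())  # expected: roughly half human, half bot
```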

Highlights

  • With the aim of showing the challenges that TweepFake poses and providing a solid baseline of detection techniques, we evaluated 13 different deepfake text detection methods: some exploiting text representations as inputs to machine-learning classifiers, others based on deep learning networks, and others relying on the fine-tuning of transformer-based classifiers (a minimal fine-tuning sketch follows this list)

  • On this last point, it is interesting to note that all the complex fine-tuned language-model methods perform remarkably worse than some character-based methods such as CHAR_GRU (a character-level GRU sketch also follows this list). This could indicate that RNNs retain a slight advantage over newer transformer networks in representing short temporal contexts, an aspect worth investigating in the future. These findings suggest that a wide variety of detectors have greater difficulty in correctly detecting a deepfake tweet than a human-written one; this is especially true for GPT-2-generated tweets, suggesting that the newest and most sophisticated generative methods based on the transformer architecture can produce more human-like short texts than older generative methods such as RNNs

  • To the best of our knowledge, no deepfake detection has been conducted on social media texts yet
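As a rough illustration of the third family of detectors mentioned above (fine-tuned transformer-based classifiers), the sketch below fine-tunes a generic encoder for binary human-vs-bot classification. It is not the paper's exact setup: the choice of bert-base-uncased, the hyperparameters, and the dataset file and column names ("tweepfake.csv", "text", "account.type") are assumptions.

```python
# Sketch: fine-tuning a transformer classifier on TweepFake-style data.
# Model choice, hyperparameters, and dataset column names are assumptions.
import pandas as pd
import torch
from torch.utils.data import DataLoader, TensorDataset
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL = "bert-base-uncased"  # any transformer encoder could be used here
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL, num_labels=2)

df = pd.read_csv("tweepfake.csv")                                  # hypothetical file name
texts = df["text"].astype(str).tolist()                            # hypothetical column
labels = torch.tensor([0 if t == "human" else 1                    # 0 = human, 1 = bot
                       for t in df["account.type"]])               # hypothetical column

enc = tokenizer(texts, truncation=True, padding=True, max_length=64, return_tensors="pt")
loader = DataLoader(TensorDataset(enc["input_ids"], enc["attention_mask"], labels),
                    batch_size=32, shuffle=True)

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
for epoch in range(3):                                             # a few epochs are usually enough
    for input_ids, attention_mask, y in loader:
        out = model(input_ids=input_ids, attention_mask=attention_mask, labels=y)
        out.loss.backward()
        optimizer.step()
        optimizer.zero_grad()
```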
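The character-based family referenced in the highlights can be illustrated with a character-level GRU classifier in the spirit of CHAR_GRU; the architecture below (byte-level vocabulary, embedding and hidden sizes) is an assumption for illustration, not the paper's exact configuration.

```python
# Sketch: a character-level GRU detector (human vs. bot). All sizes and
# the byte-level encoding are illustrative assumptions, not the paper's setup.
import torch
import torch.nn as nn

class CharGRUClassifier(nn.Module):
    def __init__(self, vocab_size=256, embed_dim=64, hidden_dim=128, num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)     # one embedding per byte
        self.gru = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, num_classes)          # human vs. bot logits

    def forward(self, char_ids):                              # char_ids: (batch, seq_len)
        emb = self.embed(char_ids)
        _, h = self.gru(emb)                                   # h: (1, batch, hidden_dim)
        return self.fc(h.squeeze(0))

def encode(text, max_len=280):
    # Encode a tweet as a fixed-length sequence of byte ids, zero-padded.
    ids = list(text.encode("utf-8"))[:max_len]
    return torch.tensor(ids + [0] * (max_len - len(ids)))

model = CharGRUClassifier()
logits = model(encode("just setting up my twttr").unsqueeze(0))  # batch of one tweet
```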


Summary

Introduction

Social media platforms were developed to connect people and to let them share their ideas and opinions through multimedia content (such as images, video, audio, and text). However, they have also been used to manipulate and alter public opinion through bots, i.e., computer programs that control a fake social media account as a legitimate human user would: by "liking", sharing, and posting old or new media, which may be real, forged through simple techniques (e.g., editing of a video, use of gap-filling texts and search-and-replace methods), or deepfake.

