Abstract

Recent advances in language modeling have significantly improved the generative capabilities of deep neural models: in 2019 OpenAI released GPT-2, a pre-trained language model that can autonomously generate coherent, non-trivial, and human-like text samples. Since then, ever more powerful text-generation models have been developed. Adversaries can exploit these generative capabilities to enhance social bots, giving them the ability to write plausible deepfake messages and to contaminate public debate. To counter this, it is crucial to develop systems that detect deepfake social media messages. However, to the best of our knowledge, the detection of machine-generated texts on social networks such as Twitter or Facebook has never been addressed. To support research in this field, we collected TweepFake, the first dataset of real deepfake tweets: real in the sense that each deepfake tweet was actually posted on Twitter. We collected tweets from a total of 23 bots, imitating 17 human accounts. The bots rely on various generation techniques, i.e., Markov chains, RNNs, RNN+Markov, LSTMs, and GPT-2. We also randomly selected tweets from the humans imitated by the bots to obtain an overall balanced dataset of 25,572 tweets (half human-written and half bot-generated). The dataset is publicly available on Kaggle. Lastly, we evaluated 13 deepfake text detection methods (based on various state-of-the-art approaches) to both demonstrate the challenges that TweepFake poses and establish a solid baseline of detection techniques. We hope that TweepFake will offer the opportunity to tackle deepfake detection on social media messages as well.
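The following is a minimal sketch of how one might load the dataset and verify the human/bot balance described above; the file name "tweepfake.csv" and the column names "text" and "account.type" are assumptions for illustration, not taken from the dataset documentation.

```python
# Minimal sketch: load the TweepFake CSV (downloaded from Kaggle) and
# check the human/bot balance. The file name and column names below
# ("tweepfake.csv", "text", "account.type") are hypothetical placeholders.
import pandas as pd

df = pd.read_csv("tweepfake.csv")
print(len(df))                            # expected: 25,572 tweets overall
print(df["account.type"].value_counts())  # expected: roughly half human, half bot
```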

Highlights

  • With the aim of showing the challenges that TweepFake poses and providing a solid baseline of detection techniques, we evaluated 13 different deepfake text detection methods: some exploiting text representations as inputs to machine-learning classifiers, others based on deep learning networks, and others relying on the fine-tuning of transformer-based classifiers (a minimal fine-tuning sketch follows this list)

  • On this last point, it is interesting to note that all the complex fine-tuned language-model methods perform remarkably worse than some character-based methods such as CHAR_GRU (a character-level GRU sketch also follows this list). This could indicate that RNNs retain a slight advantage over newer transformer networks in representing short temporal contexts, an aspect worth investigating in the future. These findings suggest that a wide variety of detectors have greater difficulty in correctly detecting a deepfake tweet than a human-written one; this is especially true for GPT-2-generated tweets, suggesting that the newest and most sophisticated generative methods based on the transformer architecture can produce more human-like short texts than older generative methods such as RNNs

  • To the best of our knowledge, no deepfake detection has been conducted on social media texts yet
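As a rough illustration of the third family of detectors mentioned above (fine-tuned transformer-based classifiers), the sketch below fine-tunes a generic encoder for binary human-vs-bot classification. It is not the paper's exact setup: the choice of bert-base-uncased, the hyperparameters, and the dataset file and column names ("tweepfake.csv", "text", "account.type") are assumptions.

```python
# Sketch: fine-tuning a transformer classifier on TweepFake-style data.
# Model choice, hyperparameters, and dataset column names are assumptions.
import pandas as pd
import torch
from torch.utils.data import DataLoader, TensorDataset
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL = "bert-base-uncased"  # any transformer encoder could be used here
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL, num_labels=2)

df = pd.read_csv("tweepfake.csv")                                  # hypothetical file name
texts = df["text"].astype(str).tolist()                            # hypothetical column
labels = torch.tensor([0 if t == "human" else 1                    # 0 = human, 1 = bot
                       for t in df["account.type"]])               # hypothetical column

enc = tokenizer(texts, truncation=True, padding=True, max_length=64, return_tensors="pt")
loader = DataLoader(TensorDataset(enc["input_ids"], enc["attention_mask"], labels),
                    batch_size=32, shuffle=True)

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
for epoch in range(3):                                             # a few epochs are usually enough
    for input_ids, attention_mask, y in loader:
        out = model(input_ids=input_ids, attention_mask=attention_mask, labels=y)
        out.loss.backward()
        optimizer.step()
        optimizer.zero_grad()
```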
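The character-based family referenced in the highlights can be illustrated with a character-level GRU classifier in the spirit of CHAR_GRU; the architecture below (byte-level vocabulary, embedding and hidden sizes) is an assumption for illustration, not the paper's exact configuration.

```python
# Sketch: a character-level GRU detector (human vs. bot). All sizes and
# the byte-level encoding are illustrative assumptions, not the paper's setup.
import torch
import torch.nn as nn

class CharGRUClassifier(nn.Module):
    def __init__(self, vocab_size=256, embed_dim=64, hidden_dim=128, num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)     # one embedding per byte
        self.gru = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, num_classes)          # human vs. bot logits

    def forward(self, char_ids):                              # char_ids: (batch, seq_len)
        emb = self.embed(char_ids)
        _, h = self.gru(emb)                                   # h: (1, batch, hidden_dim)
        return self.fc(h.squeeze(0))

def encode(text, max_len=280):
    # Encode a tweet as a fixed-length sequence of byte ids, zero-padded.
    ids = list(text.encode("utf-8"))[:max_len]
    return torch.tensor(ids + [0] * (max_len - len(ids)))

model = CharGRUClassifier()
logits = model(encode("just setting up my twttr").unsqueeze(0))  # batch of one tweet
```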


Summary

Introduction

Social media platforms were developed to connect people and to let them share their ideas and opinions through multimedia content (such as images, video, audio, and text). However, they have also been used to manipulate and alter public opinion through bots, i.e., computer programs that control a fake social media account as a legitimate human user would: by "liking", sharing, and posting old or new media, which may be real, forged through simple techniques (e.g., editing of a video, use of gap-filling texts and search-and-replace methods), or deepfake.

