Detecting fake news with deep neural networks has become an important field of research. Recent studies that fine-tune BERT for fake news detection have yielded promising results. Unfortunately, fine-tuning BERT is prone to overfitting the training data, which can reduce its generalization ability; the problem is especially challenging when the detector must generalize to unseen fake news. We first investigate the effects of several fine-tuning strategies on the generalization of BERT models for fake news detection, exploring three strategies: initialization of model parameters, freezing of model parameters, and the number of fine-tuning iterations. We then propose an adversarial fine-tuning strategy to boost the generalization of fine-tuned BERT, formulating the BERT fine-tuning procedure as a mini-max optimization problem. Building on this, we propose a novel adversarial fine-tuning strategy based on feature regularization, called FGM-FRAT, to further improve generalization: the model's representation is used to guide the generation of adversarial examples in the word-embedding space during adversarial training. Extensive experiments on several benchmark datasets, including FakeNewsNet, BuzzFeedNews, LIAR, WELFake, and KaggleFakeNews, show that the proposed method outperforms state-of-the-art fake news detection methods and improves BERT's generalization accuracy from 61% to 73%, indicating that FGM-FRAT can greatly improve the generalization of fine-tuned BERT for fake news detection. Moreover, the proposed method can also be extended to other pre-trained language models and other text classification tasks.
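To make the described training strategy concrete, below is a minimal PyTorch sketch of what one FGM-style adversarial fine-tuning step with feature regularization might look like. This is an illustration under stated assumptions, not the paper's exact formulation: the function name fgm_frat_step, the hyperparameters epsilon and lambda_feat, and the use of the [CLS] hidden state as the guiding "model representation" are all hypothetical choices.

# Sketch: FGM-style adversarial fine-tuning of BERT with an assumed
# feature-regularization term on the [CLS] representation.
import torch
import torch.nn.functional as F
from transformers import BertForSequenceClassification, BertTokenizer

model = BertForSequenceClassification.from_pretrained("bert-base-uncased")
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

def fgm_frat_step(batch, labels, epsilon=1.0, lambda_feat=0.1):
    embed = model.bert.embeddings.word_embeddings

    # Clean forward/backward pass to obtain gradients on the embeddings.
    out = model(**batch, labels=labels, output_hidden_states=True)
    clean_cls = out.hidden_states[-1][:, 0]  # clean [CLS] representation
    out.loss.backward()

    # FGM: perturb the embedding weights along the normalized gradient.
    grad = embed.weight.grad
    norm = torch.norm(grad)
    if norm != 0 and not torch.isnan(norm):
        delta = epsilon * grad / norm
        embed.weight.data.add_(delta)

        # Adversarial pass; pull the adversarial [CLS] feature back
        # toward the clean one (assumed form of feature regularization).
        adv = model(**batch, labels=labels, output_hidden_states=True)
        adv_cls = adv.hidden_states[-1][:, 0]
        feat_reg = F.mse_loss(adv_cls, clean_cls.detach())
        (adv.loss + lambda_feat * feat_reg).backward()

        embed.weight.data.sub_(delta)  # restore the original embeddings

    optimizer.step()
    optimizer.zero_grad()

# Hypothetical usage on a toy batch (label convention assumed):
model.train()
batch = tokenizer(["an example news headline"], return_tensors="pt")
fgm_frat_step(batch, labels=torch.tensor([1]))

Accumulating gradients from both the clean and the perturbed forward pass before a single optimizer step follows the common FGM recipe; how FGM-FRAT combines the two losses in the paper may differ.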