BSTC: A Fake Review Detection Model Based on a Pre-Trained Language Model and Convolutional Neural Network

Junwen Lu, Xinrong Zhan, Guanfeng Liu, Xiaolong Deng, Xintao Zhan

doi:10.3390/electronics12102165


Open Access

https://doi.org/10.3390/electronics12102165


Abstract

Detecting fake reviews helps customers make better purchasing decisions and helps maintain a healthy online business environment. In recent years, pre-trained language models have significantly improved performance on natural language processing tasks. These models generate a different representation vector for a word depending on its context, which addresses polysemy (one word with multiple meanings), a problem that traditional static word vector methods such as Word2Vec cannot solve, and so they capture a text’s contextual information better. In addition, reviews generally contain rich expressions of opinion and sentiment, yet most pre-trained language models, including BERT, do not take sentiment knowledge into account during pre-training. Based on these considerations, we propose BSTC, a new fake review detection model based on a pre-trained language model and convolutional neural network. BSTC combines BERT, SKEP, and TextCNN, where SKEP is a pre-trained language model enhanced with sentiment knowledge. We conducted a series of experiments on three gold-standard datasets, and the results show that BSTC outperforms state-of-the-art methods in detecting fake reviews. It achieved the highest accuracy on all three datasets (Hotel, Restaurant, and Doctor), with 93.44%, 91.25%, and 92.86%, respectively.
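The abstract names the three components of BSTC but does not say how they are wired together. As a rough illustration only, the sketch below assumes that per-token vectors from two encoders are concatenated token by token and then passed through TextCNN-style filters (convolution over token windows, ReLU, max-over-time pooling). The tiny hand-written vectors stand in for BERT and SKEP outputs, and every function name here is hypothetical rather than taken from the paper.

```python
def concat_token_vectors(first, second):
    """Join two per-token vector sequences token by token (assumed fusion step)."""
    if len(first) != len(second):
        raise ValueError("both encoders must yield one vector per token")
    return [a + b for a, b in zip(first, second)]


def conv_relu_max_pool(tokens, kernel, bias=0.0):
    """Slide one filter over the tokens, apply ReLU, and max-pool over positions."""
    width = len(kernel)  # number of consecutive tokens the filter covers
    if len(tokens) < width:
        raise ValueError("sequence is shorter than the filter width")
    best = 0.0  # ReLU floor: the pooled value can never drop below zero
    for start in range(len(tokens) - width + 1):
        score = bias
        for offset in range(width):
            row, vec = kernel[offset], tokens[start + offset]
            score += sum(w * x for w, x in zip(row, vec))
        best = max(best, score)
    return best


def textcnn_features(tokens, filters):
    """One pooled feature per (kernel, bias) filter, as in a TextCNN layer."""
    return [conv_relu_max_pool(tokens, kernel, bias) for kernel, bias in filters]


# Toy stand-ins for contextual vectors of a three-token review (not real model output).
bert_like = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
skep_like = [[0.5], [-0.5], [0.0]]
tokens = concat_token_vectors(bert_like, skep_like)

filters = [
    ([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]], 0.0),  # width-2 filter over the first two dims
    ([[0.0, 0.0, -1.0]], 0.0),                  # width-1 filter over the sentiment-like dim
]
features = textcnn_features(tokens, filters)
print(features)  # [2.0, 0.5]
```

In a real system the pooled feature vector would feed a classifier that labels the review as fake or genuine, and the filter weights would be learned rather than written by hand.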

Full Text