Abstract

Online reviews play an increasingly important role in the purchase decisions of potential customers. Meanwhile, driven by the desire for profit or publicity, spammers may be hired to write fake reviews that promote or demote the reputation of products or services. Opinion spam detection has accordingly attracted attention from both business and research communities in recent years. However, unlike tasks such as news or blog classification, existing review spam datasets are typically small because human annotation is expensive, which limits detection performance even when excellent classifiers are available. In this paper, we propose a novel approach that boosts opinion spam detection performance by fully exploiting the existing small labelled dataset. We first design an annotation extension scheme that trains multiple extra-trees estimators and then iteratively generates reliably labelled samples from unlabelled ones. We then train neural network models on the extended dataset to learn distributed representations. Experimental results suggest that the proposed approach has better generalization capability and outperforms state-of-the-art methods.
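The annotation-extension loop described above can be sketched as follows. This is an illustrative reconstruction, not the paper's exact algorithm: the function name, thresholds, and ensemble size are assumptions, and scikit-learn's `ExtraTreesClassifier` stands in for the extra-trees estimators the abstract mentions. Unlabelled samples that the ensemble labels with high average confidence are moved into the labelled pool, and the process repeats.

```python
# Hypothetical sketch of the annotation-extension loop: train several
# extra-trees classifiers on the small labelled set, pseudo-label the
# unlabelled pool, and keep only high-confidence predictions.
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier

def extend_annotations(X_lab, y_lab, X_unlab, n_estimators=3,
                       threshold=0.9, max_rounds=5, seed=0):
    """Iteratively pseudo-label unlabelled samples.

    Assumes class labels are integers 0..K-1 (illustrative choice).
    """
    X_lab, y_lab = np.asarray(X_lab, float), np.asarray(y_lab)
    pool = np.asarray(X_unlab, float)
    for _ in range(max_rounds):
        if len(pool) == 0:
            break
        # Train several estimators that differ only in their random seeds.
        clfs = [ExtraTreesClassifier(n_estimators=50, random_state=seed + i)
                .fit(X_lab, y_lab) for i in range(n_estimators)]
        # Average predicted class probabilities across the ensemble.
        proba = np.mean([c.predict_proba(pool) for c in clfs], axis=0)
        conf = proba.max(axis=1)
        keep = conf >= threshold          # accept only confident pseudo-labels
        if not keep.any():
            break
        X_lab = np.vstack([X_lab, pool[keep]])
        y_lab = np.concatenate([y_lab, proba[keep].argmax(axis=1)])
        pool = pool[~keep]
    return X_lab, y_lab
```

On a toy dataset with two well-separated clusters, nearby unlabelled points are absorbed into the labelled set within the first round; in practice the reliability threshold and stopping criterion would be tuned on held-out data.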

Highlights

  • Product reviews have played an increasingly important role in the purchase decisions of potential customers

  • We present a brief review of the related work from three perspectives: deceptive opinion spam detection, semi-supervised self-labelled techniques, and neural networks for learning distributed representations

  • This finding suggests that it is feasible to train opinion spam detectors by combining a small amount of labelled spam data, which is expensive to obtain, with a large amount of readily available unlabelled data


Summary

Introduction

Product reviews have played an increasingly important role in the purchase decisions of potential customers. Neural networks have proven highly effective for text classification tasks, but they usually require large datasets to achieve good performance, whereas labelled spam review data are scarce. The main contributions of this study are as follows: (1) we propose a semi-supervised, self-training-based annotation extension scheme that trains multiple classifiers to extend the existing spam review label set with reliably pseudo-labelled samples drawn from unlabelled data; and (2) we train state-of-the-art neural network models on the extended dataset to learn distributed representations and evaluate the resulting classification performance.
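Before the label set can be extended, base classifiers must be fitted on the small labelled set. A minimal sketch of one such base estimator is shown below, assuming a TF-IDF feature encoding and an extra-trees classifier; the toy reviews, labels, and pipeline shape are illustrative assumptions, not the paper's actual features or data.

```python
# Illustrative base estimator (not the paper's exact pipeline): encode
# reviews with TF-IDF, then fit an extra-trees classifier on the small
# labelled set that the self-training loop starts from.
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline

reviews = ["great hotel, amazing staff",       # toy labelled examples
           "best stay ever, truly perfect",
           "room was dirty and noisy",
           "terrible service, never again"]
labels = [1, 1, 0, 0]                          # 1 = spam-like hype, 0 = genuine tone

clf = make_pipeline(TfidfVectorizer(),
                    ExtraTreesClassifier(n_estimators=100, random_state=0))
clf.fit(reviews, labels)

# Class probabilities for an unlabelled review; in self-training, only
# predictions whose maximum probability clears a threshold are accepted.
proba = clf.predict_proba(["amazing perfect hotel"])[0]
```

The probability vector returned by `predict_proba` is what a reliability score would be computed from when deciding whether to pseudo-label an unlabelled review.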

Deceptive Opinion Spam Detection
Semi-supervised Self-Labelled Techniques
Distributed Representation Learning
Overview of the Approach
Annotation Extension
Reliability Score
Feature Encoding
Neural Models
BiLSTM-RNN
TextCNN
Experiment Setup
Results and Analysis
Initial Self-learning Classifiers
Conclusions and Future Work

