Detection of Opinion Spam with Character n-grams

Donato Hernández Fusilier, Rafael Guzmán Cabrera, Manuel Montes-Y-Gómez, Paolo Rosso

doi:10.1007/978-3-319-18117-2_21

Abstract

In this paper we consider the detection of opinion spam as a stylistic classification task because, given a particular domain, deceptive and truthful opinions are similar in content but differ in the way they are written (style). In particular, we propose using character n-grams as features, since they have been shown to capture lexical content as well as stylistic information. We evaluated our approach on a standard corpus of 1600 hotel reviews, considering both positive and negative reviews, and compared the results obtained with character n-grams against those obtained with word n-grams. Moreover, we evaluated the effectiveness of character n-grams while decreasing the training set size in order to simulate real training conditions. The results show that character n-grams are good features for the detection of opinion spam; they seem to capture the content of deceptive opinions and the writing style of the deceiver better than word n-grams do. In particular, the results show improvements of 2.3% and 2.1% over the word-based representations in the detection of positive and negative deceptive opinions, respectively. Furthermore, character n-grams make it possible to obtain good performance even with a very small training corpus: using only 25% of the training set, a Naïve Bayes classifier achieved F1 values of up to 0.80 for both opinion polarities.
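As a rough illustration of the idea in the abstract, the sketch below extracts overlapping character n-grams from a review and feeds their counts to a small multinomial Naïve Bayes classifier. It is not the authors' implementation: the n-gram length (n=4), the lowercasing, the add-one smoothing, the class name `CharNgramNaiveBayes`, and the toy training strings are all assumptions made for this example and do not come from the paper or its corpus.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text, n=4):
    """Return all overlapping character n-grams of `text` (spaces included).

    n=4 and lowercasing are assumptions for this sketch, not values from the paper.
    """
    text = text.lower()
    return [text[i:i + n] for i in range(len(text) - n + 1)]


class CharNgramNaiveBayes:
    """Hypothetical multinomial Naive Bayes over character n-gram counts."""

    def __init__(self, n=4):
        self.n = n
        self.class_docs = Counter()               # documents seen per class
        self.ngram_counts = defaultdict(Counter)  # n-gram counts per class
        self.vocab = set()                        # all n-grams seen in training

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            grams = char_ngrams(text, self.n)
            self.class_docs[label] += 1
            self.ngram_counts[label].update(grams)
            self.vocab.update(grams)
        return self

    def predict(self, text):
        total_docs = sum(self.class_docs.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_docs.items():
            counts = self.ngram_counts[label]
            total = sum(counts.values())
            # Log prior plus add-one-smoothed log likelihood of each known n-gram.
            score = math.log(n_docs / total_docs)
            for gram in char_ngrams(text, self.n):
                if gram in self.vocab:  # n-grams unseen in training are skipped
                    score += math.log((counts[gram] + 1) / (total + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy, made-up strings purely to exercise the code; not real reviews.
model = CharNgramNaiveBayes(n=4).fit(
    ["aaaa aaaa", "bbbb bbbb"],
    ["deceptive", "truthful"],
)
```

Because the n-grams span word boundaries and include spaces and punctuation, the same feature set carries both lexical fragments and stylistic cues, which is the property the abstract relies on.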

Publication Date: Jan 1, 2015
Citations: 51
License type: other-oa

Similar Papers

Combining Word and Character N-Grams for Detecting Deceptive Opinions
Al Hafiz Akbar Maulana Siagian ... Masayoshi Aritsugi
01 Jul 2017

Deceptive Opinion Spam based On Deep Learning
Fahfouh Anass ... Mohamed Adnane Mahraz
21 Oct 2020

Impact of Behavioral and Textual Features on Opinion Spam Detection
Ajay Rastogi ... Monica Mehrotra
01 Jun 2018

Opinion spam detection framework using hybrid classification scheme
Muhammad Zubair Asghar ... Aurangzeb Khan
Soft Computing | VOL. 24
11 Jun 2019
