Abstract

The usefulness of user-generated online reviews is hampered by fake reviews, often produced by clandestinely sponsored reviewers. Detecting fake reviews is a difficult task even for laypeople, and this has also been the case for previous automatic detection approaches, which have only had a limited success. Earlier studies showed that people who tell lies or write deceptive reviews tend to select words unnaturally. We propose a novel approach to detecting fake reviews by applying a topic modeling method based on Latent Dirichlet Allocation (LDA). A unique contribution of this paper is to explicate some latent aspects of fake and truthful reviews by means of "topics" that are not necessarily subject areas but related to the word choice patterns reflecting behavioral and linguistic characteristics of the fake review writers. We constructed a labeled dataset based on Yelp and demonstrated that the proposed approach helps identifying unique aspects of fake and truthful reviews, which has a potential to improving the performance of the fake review detection task. The experimental result shows that our proposed method yields better performance than that of state-of-the-art methods for small size categories in our dataset.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call