Abstract

Detecting fraudulent reviewers is of great importance to e-commerce platforms. Existing research has invested much effort in developing comprehensive features and advanced techniques for this task. However, most of these studies have ignored the data imbalance inherent in fraudulent reviewer detection: in practice, non-fraudulent reviewers are the majority and fraudulent reviewers are the minority. To fill this gap, we propose a novel approach called ImDetector, which detects fraudulent reviewers while handling data imbalance, based on weighted latent Dirichlet allocation (LDA) and Kullback–Leibler (KL) divergence. Specifically, we develop a weighted LDA model to extract the latent topics of reviewers distributed over the review features. Asymmetric KL divergence is then adopted so that the similarity measure between reviewers is biased toward the fraudulent minority when a K-nearest-neighbor classifier is used. By mapping the reviewers to the latent topics of features derived from the weighted LDA model and measuring the similarities between reviewers using asymmetric KL divergence, the data imbalance problem in fraudulent reviewer detection is alleviated. Extensive experiments on the Yelp.com dataset demonstrate that the proposed ImDetector approach outperforms the state-of-the-art techniques used for fraudulent reviewer detection. We also explain the experimental results and present the managerial implications of this paper.
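
To make the role of KL divergence in nearest-neighbor classification concrete, the following is a minimal sketch and not the ImDetector implementation. It assumes each reviewer is already represented as a distribution over latent topics (the weighted LDA step is not shown), and it assumes the divergence is taken from the query reviewer to each training reviewer; the abstract does not specify the direction or how the bias toward the minority class is realized. The function names `kl_divergence` and `knn_predict`, the smoothing constant `eps`, and the toy data are all hypothetical.

```python
import math
from collections import Counter


def kl_divergence(p, q, eps=1e-12):
    """Return KL(p || q) for two discrete distributions of equal length.

    `eps` is an assumed smoothing constant that avoids log(0).
    """
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def knn_predict(query, train, labels, k=3):
    """Majority vote among the k training rows with the smallest KL(query || row).

    The direction of the divergence is an assumption made for illustration.
    """
    ranked = sorted(range(len(train)), key=lambda i: kl_divergence(query, train[i]))
    votes = Counter(labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical topic distributions over 3 latent topics (imbalanced: 3 vs 2).
train = [
    [0.70, 0.20, 0.10],
    [0.60, 0.30, 0.10],
    [0.65, 0.25, 0.10],
    [0.10, 0.20, 0.70],
    [0.20, 0.10, 0.70],
]
labels = ["genuine", "genuine", "genuine", "fraud", "fraud"]

# KL divergence is not symmetric: swapping the arguments changes the value.
p, q = [0.8, 0.1, 0.1], [0.4, 0.3, 0.3]
print(kl_divergence(p, q), kl_divergence(q, p))

print(knn_predict([0.15, 0.15, 0.70], train, labels, k=3))
```

Because KL(p || q) and KL(q || p) generally differ, the choice of direction changes which neighbors are ranked closest. That asymmetry is the property the abstract relies on to tilt the similarity measure toward the minority class.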
