Abstract

With the spread of fake online reviews, fake review detection has become an active research topic. Despite existing work on fake review detection, the problems of imbalanced data and feature pruning have received insufficient attention. To address these gaps, the present study proposes an ensemble model for detecting fake online reviews. The model consists of four steps, the first three of which optimize the base classifiers: (i) Data resampling: We propose a novel way to address the data imbalance problem by combining resampling with the grid search technique. (ii) Feature pruning: We conduct an ablation study to drop unimportant features. (iii) Parameter optimization: We apply the grid search algorithm to determine suitable parameter values for each base classifier. (iv) Classifier ensembling: We apply majority voting and stacking strategies to integrate the optimized base classifiers. The proposed data resampling method is also applied to the meta-classifier in the stacking ensemble model. The contribution of this study lies in combining these methods into a single model. The results show that the proposed ensemble model outperforms some existing techniques, thereby providing a new way to address the data imbalance and feature pruning issues in fake review detection.
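One way to read step (i) is as a grid search over the resampling ratio itself. The sketch below is a minimal illustration under that assumption; the abstract does not say which resampling technique or search grid the authors use. Random oversampling, the function names `oversample` and `grid_search_ratio`, and the caller-supplied `fit_and_score` callback are all hypothetical choices made for illustration.

```python
import random


def oversample(X, y, ratio, minority=1, seed=0):
    """Randomly duplicate minority samples until minority/majority reaches `ratio`.

    Assumption: random oversampling stands in for whatever resampling
    technique the paper actually uses.
    """
    rng = random.Random(seed)
    minority_idx = [i for i, label in enumerate(y) if label == minority]
    if not minority_idx:
        # Nothing to duplicate, so return unchanged copies.
        return list(X), list(y)
    majority_count = len(y) - len(minority_idx)
    target = int(round(ratio * majority_count))
    extra = max(0, target - len(minority_idx))
    picks = [rng.choice(minority_idx) for _ in range(extra)]
    X_new = list(X) + [X[i] for i in picks]
    y_new = list(y) + [y[i] for i in picks]
    return X_new, y_new


def grid_search_ratio(X_train, y_train, ratios, fit_and_score):
    """Try each candidate ratio and keep the one with the best score.

    `fit_and_score(X_resampled, y_resampled)` is a hypothetical callback that
    trains a classifier on the resampled data and returns a validation score
    (for example, the F1-score on held-out data that was NOT resampled).
    """
    best_ratio, best_score = None, float("-inf")
    for ratio in ratios:
        X_res, y_res = oversample(X_train, y_train, ratio)
        score = fit_and_score(X_res, y_res)
        if score > best_score:
            best_ratio, best_score = ratio, score
    return best_ratio, best_score
```

Scoring on untouched validation data matters in this design: if the validation split were resampled as well, the selected ratio would reflect an artificial class balance rather than the one seen in practice.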

Highlights

  • With the development of e-commerce, online shopping is becoming increasingly prevalent

  • The present study contributes to the literature by providing a different way to detect fake reviews effectively. In particular, we propose a novel approach that combines data resampling with the grid search method to address the data imbalance problem, which can substantially improve model performance on an imbalanced dataset

  • We use the F1-score as the metric to evaluate the performances of the classifiers, and the fake reviews are set as positive samples to ensure that the F1-score reflects the performance in terms of detecting fake reviews
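The choice of positive class in the last highlight can be made concrete with a short sketch. It computes the F1-score with fake reviews as the positive class; the label encoding (1 for fake, 0 for genuine) and the function name `f1_score_fake` are assumptions made for illustration and are not taken from the paper.

```python
def f1_score_fake(y_true, y_pred, fake_label=1):
    """F1-score with fake reviews treated as the positive class.

    Assumption: labels are encoded as 1 = fake, 0 = genuine.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == fake_label and p == fake_label)
    fp = sum(1 for t, p in pairs if t != fake_label and p == fake_label)
    fn = sum(1 for t, p in pairs if t == fake_label and p != fake_label)
    if tp == 0:
        # No fake review was correctly detected, so F1 is zero by convention.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

On an imbalanced dataset, a classifier that labels every review as genuine can reach high accuracy while this score stays at 0.0, which is why the positive class is fixed to the fake reviews.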

Summary

INTRODUCTION

With the development of e-commerce, online shopping is becoming increasingly prevalent. Ensemble strategies still have not been sufficiently considered. To address these gaps, the current study proposes an ensemble model for fake review detection. The ensemble model can reduce the relative weaknesses of single classifiers, as they are compensated by the strengths of the other classifiers [20]. This approach has been shown to be effective in improving classification performance [21], [22]. To the best of our knowledge, few studies on fake review detection have combined these processes in a single model. In particular, we propose a novel approach that combines data resampling with the grid search method to address the data imbalance problem, which can substantially improve model performance on an imbalanced dataset.
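The idea that an ensemble compensates for the weaknesses of single classifiers can be illustrated with majority voting, one of the two ensembling strategies named in the abstract. The sketch below is a minimal illustration, not the authors' implementation; the function name `majority_vote` and the toy predictions are hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier predictions by majority vote.

    `predictions` is a list with one list of labels per base classifier,
    all of the same length. With an even number of classifiers, ties are
    broken in favor of the label seen first, so an odd count is safer.
    """
    n_samples = len(predictions[0])
    voted = []
    for i in range(n_samples):
        votes = Counter(clf_preds[i] for clf_preds in predictions)
        voted.append(votes.most_common(1)[0][0])
    return voted


# Toy example: each hypothetical classifier misclassifies a different review.
y_true = [1, 1, 1, 0]
clf_a = [1, 0, 1, 0]
clf_b = [1, 1, 0, 0]
clf_c = [0, 1, 1, 0]
ensemble = majority_vote([clf_a, clf_b, clf_c])
```

In this toy case the errors of the three classifiers fall on different reviews, so the other two outvote each mistake. Voting helps far less when the base classifiers tend to err on the same samples.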
