Opinion spam detection is a challenge for online review systems and social forum operators. Opinion spamming costs businesses and people money since it deceives customers as well as automated opinion mining and sentiment analysis systems by bestowing undeserved positive opinions on target firms and/or bestowing fake negative opinions on others. One popular detection approach is to model a review system as a network of users, products, and reviews, for example using review graph models. In this article, we study the effects of network scale on network-based review spammer detection models, specifically on the trust model and the SpammerRank model. We then evaluate both network models using two large publicly available review datasets, namely: the Amazon dataset (containing 6 million reviews by more than 2 million reviewers) and the UCSD dataset (containing over 82 million reviews by 21 million reviewers). It has been observed thatSpammerRank model provides a better scaling time for applications requiring reviewer indicators and in case of trust model distributions are flattening out indicating variance of reviews with respect to spamming. Detailed observations on the scaling effects of these models are reported in the result section.
Read full abstract