Abstract
Intuitively, image- or video-based recommendations seem more reliable than those containing only plain text, and such recommendations have recently become widely encouraged and common across opinion-sharing platforms. Given their potential for manipulation, visual media (e.g., images and videos) are more vulnerable to spam than text. However, most state-of-the-art solutions for opinion spam detection are devoted exclusively to natural language parsing, and less work has addressed photos or videos. After investigating the top two business-to-consumer websites, i.e., JD.com and TMALL.com, we propose an unsupervised approach that labels suspected spam based on different types of duplication across images, videos and Chinese texts. Experiments verified the effectiveness of this approach and led to several conclusions: 1) image spam is more severe than video and text spam; 2) for manipulation, borrowing content from a marketing page is less attractive to spammers than stealing it from other reviewers; 3) in addition to using identical texts, spammers also use fictitious rare incidents to influence customers; and 4) overlapping duplications of images, videos and texts are common.
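As an illustration only, duplication-based labeling of the kind described above could be sketched as follows. This is not the authors' implementation: the function names, the label strings, and the use of exact SHA-256 matching over raw bytes are all assumptions made for this example, and a practical system would likely need near-duplicate matching for images and videos.

```python
import hashlib
from collections import defaultdict


def fingerprint(content: bytes) -> str:
    """Exact-match fingerprint of a text, image, or video payload (assumed approach)."""
    return hashlib.sha256(content).hexdigest()


def label_duplicates(reviews, marketing_items):
    """Label each review item by the kind of duplication found.

    reviews: list of (review_id, reviewer_id, content_bytes) tuples (hypothetical schema).
    marketing_items: iterable of content_bytes taken from the seller's marketing page.
    """
    marketing = {fingerprint(item) for item in marketing_items}

    # Record which distinct reviewers posted each fingerprint.
    reviewers_by_fp = defaultdict(set)
    for _, reviewer_id, content in reviews:
        reviewers_by_fp[fingerprint(content)].add(reviewer_id)

    labels = {}
    for review_id, _, content in reviews:
        fp = fingerprint(content)
        if fp in marketing:
            labels[review_id] = "copied_from_marketing_page"
        elif len(reviewers_by_fp[fp]) > 1:
            labels[review_id] = "copied_from_other_reviewer"
        else:
            labels[review_id] = "no_duplicate_found"
    return labels


reviews = [
    ("r1", "u1", b"great phone"),
    ("r2", "u2", b"great phone"),
    ("r3", "u3", b"official banner"),
    ("r4", "u4", b"my own photo"),
]
labels = label_duplicates(reviews, [b"official banner"])
```

In this sketch, the same routine is applied regardless of media type, since every item is reduced to bytes before hashing; the two duplicate labels mirror the two sources of copied content named in conclusion 2.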