Abstract
Search rank fraud, the fraudulent promotion of products hosted on peer-review sites, is driven by expert workers recruited online, often from crowdsourcing sites. In this paper we introduce the fraud de-anonymization problem, which goes beyond fraud detection to unmask the human masterminds responsible for posting search rank fraud on peer-review sites. We collect and study data from crowdsourced search rank fraud jobs, and survey the capabilities and behaviors of 58 search rank fraud workers recruited from 6 crowdsourcing sites. We collect a gold standard dataset of Google Play user accounts attributed to 23 crowdsourced workers and analyze their fraudulent behaviors in the wild. We propose Dolos, a fraud de-anonymization system that leverages traits and behaviors extracted from our studies to attribute detected fraud to crowdsourcing site workers, and thus to real identities and bank accounts. We introduce MCDense, a min-cut dense component detection algorithm that uncovers groups of user accounts controlled by different workers, and use stylometry and supervised learning to attribute these groups to crowdsourcing site profiles. Dolos correctly identified the owners of 95 percent of fraud worker-controlled communities, and uncovered fraud workers who promoted as many as 97.5 percent of the fraud apps we collected from Google Play.
When evaluated on 13,087 apps (820,760 reviews), which we monitored over more than 6 months, Dolos identified 1,056 apps with suspicious reviewer groups. We report orthogonal evidence of their fraud, including fraud duplicates and fraud re-posts. Dolos significantly outperformed adapted dense subgraph detection and loopy belief propagation competitors on two new coverage scores that measure the quality of detected community partitions.
More From: IEEE Transactions on Knowledge and Data Engineering