Experimental Evaluation of Possible Feature Combinations for the Detection of Fraudulent Online Shops

Audronė Janavičiūtė,Agnius Liutkevičius,Nerijus Morkevičius,Gedas Dabužinskas

doi:10.3390/app14020919

Abstract

Online shopping has become a common and popular form of shopping, so online attackers try to extract money from customers by creating online shops whose purpose is to compel the buyer to disclose credit card details or to pay money for goods that are never delivered. Existing buyer protection methods are based on the analysis of the content of the online shop, customer reviews, the URL (Uniform Resource Locator) of the website, the search in blacklists or whitelists, or the combination of the above-mentioned methods. This study aims to find the minimal set of publicly and easily obtainable features to create high-precision classification solutions that require little computing and memory resources. We evaluate various combinations of 18 features that belong to three possible categories, namely URL-based, content-based, and third-party services-based. For this purpose, the custom dataset is created, and several machine learning models are applied for the detection of fraudulent online shops based on these combinations of features. The results of this study show that even only four of the most significant features allow one to achieve 0.9342 classification accuracy, while 0.9605 accuracy is reached with seven features, and the best accuracy of 0.9693 is achieved using thirteen and fifteen features.

Full Text