Abstract

With the prevalence of mobile e-commerce, fraudulent transactions conducted by robots are becoming increasingly common in mobile payments, severely undermining market fairness and causing financial losses. Identifying robotic automation accurately and efficiently among a massive number of transactions has become a difficult problem for mobile applications, and existing research offers no effective method or engineering implementation. In this article, we present an extension to boosting algorithms that incorporates prior human knowledge to compensate for a shortage of training data and to improve prediction results. Prior human knowledge is accumulated from historical fraudulent transactions or transferred from other domains in the form of expert rules and blacklists. This knowledge is applied to extract risk features from transaction data, and the risk features are fed into the boosting algorithm for training together with the normal features, so that the resulting model benefits from both the data and the prior knowledge. For the first time, we verified the effectiveness of the method on a widely deployed mobile APP with 150+ million users. In experiments on a dataset, the extended boosting model raises accuracy from 0.9825 to 0.9871 and recall from 0.888 to 0.948. We also investigated feature differences between robots and normal users and identified behavior patterns of robotic automation: less spatial motion detected by device sensors (1/10 of the normal user pattern), a higher IP group-clustering ratio (60% in robots vs. 15% in normal users), a higher jailbroken device rate (92.47% vs. 4.64%), more irregular device names, and fewer IP address changes.
The quantitative analysis helps APP developers and service providers understand and prevent fraudulent transactions from robotic automation. This article proposes an optimized boosting model that is better suited to detecting robotic automation on mobile phones. By combining prior knowledge with feature importance analysis, the model is more robust when the real dataset is unbalanced or has only a few samples. The model is also more explainable, because the feature analysis can be used to generate disposal rules in production systems that block fake mobile users.

Highlights

  • Mobile e-commerce has developed rapidly in recent years: in 2019 the number of mobile transactions in China reached 101 billion, worth 347 trillion yuan, increases of 67.57% and 25.13%, respectively [1]

  • A large body of human knowledge accumulates in daily enterprise system operation, but most machine learning models do not allow prior knowledge to be incorporated directly

  • Distinguishing robotic automation from normal user operations has become a difficult technical and engineering problem for mobile payment applications


Summary

Introduction

Mobile e-commerce has developed rapidly in recent years: in 2019 the number of mobile transactions in China reached 101 billion, worth 347 trillion yuan, increases of 67.57% and 25.13%, respectively [1]. A large body of human knowledge accumulates in daily enterprise system operation, but most machine learning models do not allow prior knowledge to be incorporated directly. Distinguishing robotic automation from normal user operations has become a difficult technical and engineering problem for mobile payment applications. We propose an extension to boosting algorithms that combines human knowledge with training data for fraud detection in mobile payment transactions. We collect raw data from a mobile APP under user authorization, extract different features from the raw data, label 31,500 payment transactions from 14 days as datasets, and train and test the extended model on these datasets.
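The idea of feeding knowledge-derived risk features into a boosting algorithm alongside normal features can be illustrated with a minimal sketch. It is not the implementation used in the article: the rules, rule weights, blacklist entries, field names and toy transactions below are all hypothetical, and a small discrete AdaBoost over decision stumps stands in for whichever boosting algorithm is actually deployed.

```python
import math

# Hypothetical prior knowledge: a blacklist and weighted expert rules.
BLACKLISTED_IPS = {"10.0.0.7"}
EXPERT_RULES = [
    (lambda t: t["jailbroken"], 2.0),             # jailbroken device
    (lambda t: t["ip"] in BLACKLISTED_IPS, 3.0),  # IP found on the blacklist
    (lambda t: t["motion"] < 0.1, 1.0),           # almost no sensor motion
]

def risk_value(txn):
    """Start from 0 and add the weight of every rule the transaction hits."""
    value = 0.0
    for rule, weight in EXPERT_RULES:
        if rule(txn):
            value += weight
    return value

def to_features(txn):
    """Normal features followed by the knowledge-derived risk feature."""
    return [txn["amount"], txn["motion"], risk_value(txn)]

def fit_stump(X, y, w):
    """Return (error, feature, threshold, sign) with the lowest weighted error."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({x[j] for x in X}):
            for sign in (1, -1):
                pred = [sign if x[j] >= thr else -sign for x in X]
                err = sum(wi for wi, p, yi in zip(w, pred, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    return best

def stump_predict(stump, x):
    _, j, thr, sign = stump
    return sign if x[j] >= thr else -sign

def adaboost(X, y, rounds=5):
    """Discrete AdaBoost; labels are +1 (robot) and -1 (normal user)."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        stump = fit_stump(X, y, w)
        err = min(max(stump[0], 1e-10), 1 - 1e-10)  # avoid log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, stump))
        # Up-weight the samples this stump got wrong, then renormalise.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, xi))
             for wi, xi, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def predict(model, x):
    score = sum(alpha * stump_predict(s, x) for alpha, s in model)
    return 1 if score >= 0 else -1

# Toy data (hypothetical): the second robot fakes sensor motion, so the
# normal features alone cannot separate the classes with a single split.
transactions = [
    {"amount": 20.0, "motion": 0.02, "jailbroken": True,  "ip": "10.0.0.7"},
    {"amount": 35.0, "motion": 0.80, "jailbroken": True,  "ip": "10.0.0.8"},
    {"amount": 20.0, "motion": 0.90, "jailbroken": False, "ip": "10.0.1.3"},
    {"amount": 35.0, "motion": 0.70, "jailbroken": False, "ip": "10.0.2.4"},
]
labels = [1, 1, -1, -1]
X = [to_features(t) for t in transactions]
model = adaboost(X, labels)
predictions = [predict(model, x) for x in X]
```

In this toy setting the first stump splits on the risk feature, because neither the amount nor the motion value separates the two robots from the two normal users with a single threshold. The same structure carries over to a production boosting library: the risk feature is simply one more input column, so its contribution also shows up in any feature importance analysis.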

Related Works
Verification Code Technology
Short Message Verification Technology
Biometric Identification Technology
Expert Rule Judgment
Data and Features
Collect Data on Mobile Terminals
Static Data
Slowly Changing Data
Dynamic Data
Extract Normal Features from Raw Data
Continuous Features
Extract Risk Features by Incorporating Prior Knowledge
Machine Learning Model
Boosting Machine Learning Model
Extension to the Boosting Model by Incorporating Prior Knowledge
Label the Dataset
Train and Test Boosting Model
Train and Test Extended Boosting Model
Behavior Patterns of Robotic Automation
Device Movement Pattern
IP Address Group-Clustering Pattern
Device Naming Pattern
IP Address Change Pattern
Discussion and Future
Methods
Findings
Conclusions
