Usage of machine learning methods for early detection of money laundering schemes

Jenny Domashova, Natalia Mikhailina

doi:10.1016/j.procs.2021.06.033

Abstract

The article presents the results of applying machine learning methods to identify organizations prone to money laundering. Data preprocessing methods were analyzed, namely categorical feature encoding and informative feature selection. Classification methods were studied, in particular ensemble methods, along with algorithms for selecting optimal hyperparameters and methods for assessing model quality. The features most significant for anti-money laundering and combating the financing of terrorism (AML/CFT) that characterize suspicious organizations at the time a current account is opened were determined. Combinations of different categorical feature transformations with the type of cross-validation used in modeling this task were explored, and the expediency of using TargetEncoder with double cross-validation was demonstrated. A model for identifying organizations prone to money laundering was trained; the best prediction quality was achieved by a gradient boosting algorithm over decision trees. The quality of hyperparameter selection with the hyperopt and optuna Python libraries was studied, and the speed of obtaining the optimal set was estimated. The model can be used to form a list of the most important indicators for early detection of organizations involved in money laundering and terrorist financing (ML/TF), and to develop appropriate recommendations for improving the compliance control process. A software tool was developed in Python to support early detection of organizations prone to money laundering.
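As an illustration of why target encoding is paired with cross-validation, the following is a minimal sketch of out-of-fold target encoding in plain Python. It is not the authors' implementation: the function names, the interleaved fold assignment, the smoothing rule, and the toy data (`legal_form`, `suspicious`) are all assumptions made for the example. Each row is encoded with target statistics computed on the other folds only, so its own label does not leak into its feature value. In a double (nested) scheme, this inner step would presumably be repeated inside the training part of each outer validation fold.

```python
from collections import defaultdict


def fit_target_means(categories, targets, prior, smoothing=1.0):
    """Smoothed mean target per category, shrunk towards the prior."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for cat, y in zip(categories, targets):
        sums[cat] += y
        counts[cat] += 1
    return {
        cat: (sums[cat] + smoothing * prior) / (counts[cat] + smoothing)
        for cat in counts
    }


def out_of_fold_target_encode(categories, targets, n_folds=3, smoothing=1.0):
    """Encode each row using target statistics from the other folds only."""
    n = len(categories)
    encoded = [0.0] * n
    for fold in range(n_folds):
        # Deterministic interleaved folds (an assumption for this sketch);
        # a real pipeline would typically shuffle and stratify.
        holdout = [i for i in range(n) if i % n_folds == fold]
        train = [i for i in range(n) if i % n_folds != fold]
        prior = sum(targets[i] for i in train) / len(train)
        means = fit_target_means(
            [categories[i] for i in train],
            [targets[i] for i in train],
            prior,
            smoothing,
        )
        for i in holdout:
            # Categories unseen in the training folds fall back to the prior.
            encoded[i] = means.get(categories[i], prior)
    return encoded


# Hypothetical toy data: legal form of an organization and a "suspicious" label.
legal_form = ["LLC", "LLC", "JSC", "LLC", "JSC", "SP"]
suspicious = [1, 0, 0, 1, 0, 1]
encoded = out_of_fold_target_encode(legal_form, suspicious, n_folds=3)
```

In this toy run the first row has label 1 but receives a low encoded value, because its encoding is computed from the other folds rather than from its own label.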

Full Text