Abstract

This article presents the results of applying machine learning methods to identify organizations prone to money laundering. Data preprocessing methods were analyzed, namely categorical feature encoding and informative feature selection. Classification methods were studied, in particular ensemble machine learning methods, together with algorithms for selecting optimal hyperparameters and methods for assessing model quality. The features most significant for anti-money laundering and combating the financing of terrorism (AML/CFT) in suspicious organizations at the time a current account is opened were determined. Combinations of different categorical feature transformation methods with the type of cross-validation used in modeling this task were explored, and the expediency of using TargetEncoder with double cross-validation was demonstrated. A model for identifying organizations prone to money laundering was trained; the best prediction quality was achieved with a gradient boosting algorithm over decision trees. The quality of hyperparameter selection using the hyperopt and optuna Python libraries was studied, and the speed of obtaining the optimal set was estimated. The model can be used to form a list of the most important indicators for early detection of organizations involved in money laundering and terrorist financing (ML/TF), as well as to develop appropriate recommendations for improving the compliance control process. A software tool in Python was developed for the early detection of organizations prone to money laundering.
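The abstract does not define "double cross-validation" or give the encoder settings, so the following is only a minimal sketch of one plausible reading: target encoding is fitted inside inner folds of each outer training split, so that no row is encoded with its own label. It uses plain Python rather than the TargetEncoder class named above. The function names, the smoothing scheme, the fold assignment by row position, and the toy data are all assumptions made for illustration, not the authors' implementation.

```python
from collections import defaultdict


def fit_target_means(categories, targets, smoothing=1.0):
    """Return (smoothed target mean per category, global target mean)."""
    global_mean = sum(targets) / len(targets)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for cat, y in zip(categories, targets):
        sums[cat] += y
        counts[cat] += 1
    # Additive smoothing pulls rare categories toward the global mean.
    means = {
        cat: (sums[cat] + smoothing * global_mean) / (counts[cat] + smoothing)
        for cat in counts
    }
    return means, global_mean


def out_of_fold_encode(categories, targets, n_folds=2, smoothing=1.0):
    """Encode each row with statistics fitted on the other folds only.

    Folds are assigned by row position (i % n_folds), which assumes the
    rows are already shuffled. Requires n_folds >= 2.
    """
    n = len(categories)
    encoded = [0.0] * n
    for fold in range(n_folds):
        held_out = [i for i in range(n) if i % n_folds == fold]
        fit_rows = [i for i in range(n) if i % n_folds != fold]
        means, global_mean = fit_target_means(
            [categories[i] for i in fit_rows],
            [targets[i] for i in fit_rows],
            smoothing,
        )
        for i in held_out:
            # Categories unseen in the fitting rows fall back to the global mean.
            encoded[i] = means.get(categories[i], global_mean)
    return encoded


def nested_cv_encode(categories, targets, n_outer=3, n_inner=2, smoothing=1.0):
    """Yield (train_idx, train_enc, valid_idx, valid_enc) for each outer fold."""
    n = len(categories)
    for fold in range(n_outer):
        valid_idx = [i for i in range(n) if i % n_outer == fold]
        train_idx = [i for i in range(n) if i % n_outer != fold]
        train_cats = [categories[i] for i in train_idx]
        train_y = [targets[i] for i in train_idx]
        # Inner loop: training rows are encoded out-of-fold.
        train_enc = out_of_fold_encode(train_cats, train_y, n_inner, smoothing)
        # Validation rows use means fitted on the whole outer training split.
        means, global_mean = fit_target_means(train_cats, train_y, smoothing)
        valid_enc = [means.get(categories[i], global_mean) for i in valid_idx]
        yield train_idx, train_enc, valid_idx, valid_enc


# Hypothetical toy data: a legal-form feature and a binary "suspicious" label.
legal_form = ["llc", "llc", "sole", "llc", "sole", "jsc"]
suspicious = [1, 0, 0, 1, 0, 1]
folds = list(nested_cv_encode(legal_form, suspicious, n_outer=3, n_inner=2))
```

Each tuple in `folds` could then be passed to a classifier such as a gradient boosting model, with the encoded column replacing the raw categorical feature. Fitting the encoder separately inside every split is what keeps the validation score free of target leakage.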
