Machine learning in classifying bitcoin addresses

Leonid Garin, Vladimir Gisin

doi:10.1016/j.jfds.2023.100109

Abstract

The emergence of the Bitcoin cryptocurrency marked a new era of illegal transactions. Cryptocurrency provides a degree of anonymity: users can create an unlimited number of wallets with alias addresses, which makes it difficult to identify the actual user. Criminals exploit this to carry out illegal transactions. At the same time, Bitcoin stores and exposes information about every completed transaction, which makes it possible to identify suspicious behavior patterns in the network using data mining. The problem of detecting suspicious activity in the Bitcoin network can be solved with sufficiently high accuracy using machine learning methods. This paper presents a comparative study of several machine learning methods for this problem: logistic regression, decision tree, random forest, and gradient boosting. Hyperparameter selection, dataset rebalancing, and active learning are particularly important. The most important hyperparameters of the algorithms are described. The evaluation metrics indicate that gradient boosting is the most promising method. In total, 38 features of Bitcoin addresses were identified; the top features are presented in the paper.
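To make the role of dataset rebalancing and of the evaluation metrics more concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the authors' pipeline: random oversampling of the minority class is only one possible rebalancing technique (the abstract does not say which one is used), and the names `oversample_minority` and `precision_recall_f1`, the two toy features, and the 8-to-2 class split are all hypothetical.

```python
import random
from collections import Counter


def oversample_minority(rows, labels, seed=0):
    """Randomly duplicate rows of smaller classes until every class
    has as many rows as the largest one (random oversampling)."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(members) for members in by_class.values())

    out_rows, out_labels = [], []
    for label, members in by_class.items():
        # Draw duplicates with replacement to fill the gap to `target`.
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall and F1 for the class labelled `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Toy data (hypothetical): 8 ordinary addresses (label 0) and
# 2 suspicious addresses (label 1), each with two made-up features.
rows = [[i, i % 3] for i in range(10)]
labels = [0] * 8 + [1] * 2

balanced_rows, balanced_labels = oversample_minority(rows, labels)
print(Counter(labels), Counter(balanced_labels))
```

In practice, rebalancing of this kind is applied only to the training split, so that the test set keeps the original class proportions. Precision, recall and F1 for the suspicious class are usually more informative than plain accuracy when that class is rare.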

Full Text