Abstract
Cryptocurrencies gain users' trust by publicly disclosing their full creation and transaction history. In return, the transaction history faithfully records the whole spectrum of cryptocurrency user behaviors. This article analyzes and summarizes existing research on knowledge discovery in cryptocurrency transactions using data mining techniques. Specifically, we classify the existing research into three aspects: transaction tracing and blockchain address linking, the analysis of collective user behaviors, and the study of individual user behaviors. For each aspect, we present the problems, summarize the methodologies, and discuss major findings in the literature. Furthermore, we enumerate transaction data parsing and visualization tools and services. Finally, we outline several gaps and trends for future investigation in this research area.
Highlights
As of 2020, more than 7000 cryptocurrencies are actively traded on more than 20 000 online exchanges
We provide a summary of transaction analysis and visualization tools
As of 2020, the total number of unique Bitcoin addresses appearing in the transactions was more than 600 million [82], and that of Ethereum over 100 million [3], with a daily increase of hundreds of thousands [2]
Summary
As of 2020, more than 7000 cryptocurrencies are actively traded on more than 20 000 online exchanges. Besides the most commonly used multiple-input and coin-change rules, other heuristics for Bitcoin transactions may assume that all the outputs in a coinbase transaction belong to the same entity [44], or exploit specific transaction patterns, e.g., apparent self-transferring operations and those that resemble money laundering activities in a conventional banking system, to associate addresses [45]. Considering that users would deposit and withdraw the same amount of cryptocurrency to and from the mixing service, a widely adopted method to find input-output pairs is to find matched values or value combinations in the multiple-input-multiple-output transactions [41], [48], [49]. This problem can be related to the classical subset sum problem. Yu et al. [46] were able to identify the real coin being spent in 71% of Monero inputs, 74% of Bytecoin inputs, and 92% of DigitalNote inputs, using the zero-mix rule and their "closed set" attack.
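The value-matching idea for mixing transactions can be illustrated with a small brute-force sketch. It is not the method of [41], [48], or [49]. It only assumes that amounts are integers in the coin's smallest unit, that fees are ignored, and that a hypothetical `max_subset_size` parameter caps how many inputs may be combined. The function name and the sample values are made up for illustration.

```python
from itertools import combinations


def match_outputs_to_inputs(inputs, outputs, max_subset_size=3):
    """For each output value, list the index tuples of inputs whose values sum to it.

    Illustrative only: exhaustive search over input subsets (a bounded
    subset sum search), with fees ignored and amounts given as integers.
    """
    matches = {}
    for out_idx, out_value in enumerate(outputs):
        candidates = []
        for size in range(1, max_subset_size + 1):
            for combo in combinations(range(len(inputs)), size):
                if sum(inputs[i] for i in combo) == out_value:
                    candidates.append(combo)
        matches[out_idx] = candidates
    return matches


# Hypothetical multiple-input-multiple-output transaction (made-up amounts).
inputs = [5_000, 12_000, 7_000, 30_000]
outputs = [12_000, 30_000, 12_000]

result = match_outputs_to_inputs(inputs, outputs)
for out_idx, candidates in result.items():
    print(out_idx, candidates)
```

In this toy transaction the 30 000 output has a single candidate input, while each 12 000 output matches either one input or a combination of two, which shows why matched values alone can leave the pairing ambiguous. The exhaustive search grows exponentially with the subset size, so it is practical only for small transactions.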