Abstract

The ability to perform fast analysis on massive public blockchain transaction data is needed in various applications, such as tracing fraudulent financial transactions. Blockchain data grows continuously and is organized as a sequence of blocks containing transactions. This organization, however, is not suitable for parallel graph algorithms, which need efficient distributed graph data structures. Using the Message Passing Interface (MPI), we develop a scalable cluster-based system that constructs a distributed transaction graph in parallel and implement various transaction analysis algorithms. We report performance results from our system operating on roughly 5 years of Ethereum Mainnet blockchain data comprising 10.2 million blocks. We report timings for distributed transaction graph construction, partitioning, page ranking of addresses, degree distribution, token transaction counting, connected components finding, and our new parallel blacklisted address trace forest computation algorithm on an economical 16-node cluster set up on the Amazon cloud. Our system is able to construct a distributed graph of 766 million transactions in 218 s and compute the forest of blacklisted address traces in 32 s.

Highlights

  • Public blockchain platforms that operate autonomously under the control of no one have become popular globally

  • Using the Message Passing Interface (MPI), we develop a scalable cluster-based system that constructs a distributed transaction graph in parallel and implement various transaction analysis algorithms

  • We report timings for distributed transaction graph construction, partitioning, page ranking of addresses, degree distribution, token transaction counting, connected components finding, and our new parallel blacklisted address trace forest computation algorithm on an economical 16-node cluster set up on the Amazon cloud


Summary

Introduction

Public blockchain platforms that operate autonomously under the control of no one have become popular globally. In the field of finance, there is a need for systems that can quickly trace fraudulent activities in massive public blockchain transaction data. This need has led to the emergence of firms such as Chainalysis [3], which is highly valued, and CipherTrace, which was recently acquired [4]. These developments provide evidence that scalable, parallel systems capable of analyzing big blockchain transaction graph data will be needed in the near future. This is the problem addressed in this paper.
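To make the problem concrete, the following is a minimal serial sketch in Python of the two steps named in the abstract: turning a sequence of blocks into a transaction graph, and tracing outward from blacklisted addresses to obtain a forest. It is an illustration only, not the paper's distributed MPI algorithm. The function names, the `(sender, receiver)` tuple layout, and the toy addresses are all assumptions, and the trace here is a plain breadth-first search that ignores transaction order and amounts.

```python
from collections import defaultdict, deque


def build_transaction_graph(blocks):
    """Flatten a sequence of blocks into an adjacency list: sender -> receivers.

    Each block is assumed to be a list of (sender, receiver) pairs.
    """
    graph = defaultdict(list)
    for block in blocks:
        for sender, receiver in block:
            graph[sender].append(receiver)
    return graph


def trace_forest(graph, blacklist):
    """Breadth-first trace from every blacklisted address.

    Returns a parent map: roots (blacklisted addresses) map to None, and
    every other reached address maps to the address that first reached it.
    """
    parent = {addr: None for addr in blacklist}
    queue = deque(blacklist)
    while queue:
        current = queue.popleft()
        for neighbor in graph.get(current, ()):
            if neighbor not in parent:  # first visit fixes the tree edge
                parent[neighbor] = current
                queue.append(neighbor)
    return parent


# Hypothetical toy data: three blocks of (sender, receiver) transactions.
blocks = [
    [("A", "B"), ("C", "D")],
    [("B", "E"), ("X", "C")],
    [("E", "F"), ("D", "A")],
]
graph = build_transaction_graph(blocks)
forest = trace_forest(graph, ["A", "X"])
```

In a distributed setting the adjacency lists would be spread across processes, so each frontier expansion would involve exchanging newly reached addresses between their owners; this sketch leaves that out entirely.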

Previous work
Blockchain graph system architecture
Distributed transaction graph construction
Distributed calculation of the blacklisted address trace forest
MPI processes per node
Discussion and conclusion