Detecting DeFi securities violations from token smart contract code

Arianna Trozze,Arianna Trozze,Bennett Kleinberg,Bennett Kleinberg,Toby Davies,Toby Davies

doi:10.1186/s40854-023-00572-5

Abstract

Decentralized Finance (DeFi) is a system of financial products and services built and delivered through smart contracts on various blockchains. In recent years, DeFi has gained popularity and market capitalization. However, it has also been connected to crime, particularly various types of securities violations. The lack of Know Your Customer requirements in DeFi poses challenges for governments trying to mitigate potential offenses. This study aims to determine whether this problem is suited to a machine learning approach, namely, whether we can identify DeFi projects potentially engaging in securities violations based on their tokens’ smart contract code. We adapted prior works on detecting specific types of securities violations across Ethereum by building classifiers based on features extracted from DeFi projects’ tokens’ smart contract code (specifically, opcode-based features). Our final model was a random forest model that achieved an 80% F-1 score against a baseline of 50%. Notably, we further explored the code-based features that are the most important to our model’s performance in more detail by analyzing tokens’ Solidity code and conducting cosine similarity analyses. We found that one element of the code that our opcode-based features can capture is the implementation of the SafeMath library, although this does not account for the entirety of our features. Another contribution of our study is a new dataset, comprising (a) a verified ground truth dataset for tokens involved in securities violations and (b) a set of legitimate tokens from a reputable DeFi aggregator. This paper further discusses the potential use of a model like ours by prosecutors in enforcement efforts and connects it to a wider legal context.

Full Text