Abstract

As a typical representative of blockchain 2.0, Ethereum has significantly boosted the development of blockchain finance. However, owing to Ethereum's anonymity and financial attributes, the number of phishing scams is increasing rapidly and causing massive losses, which poses a serious threat to blockchain financial security. Phishing scam address identification makes it possible to detect phishing scam addresses and alert users, thereby reducing losses. However, the phishing scam address recognition task faces three primary challenges: 1) the lack of large, publicly available datasets of phishing scam address transactions; 2) the large number of queries and computations required to use multi-order transaction information; and 3) an excessive reliance on machine learning methods for extracting phishing scam address features, which strips the features of practical meaning and hinders research on phishing scam addresses. This paper proposes a systematic phishing scam address recognition scheme that addresses all three challenges simultaneously. Specifically, because the existing publicly available Ethereum phishing scam address transaction datasets contain too few tagged addresses, we first construct a transaction dataset involving over 10,000 tagged addresses. To the best of our knowledge, this is the largest dataset of tagged addresses for Ethereum phishing scam detection. Then, we design a new heuristic rule for extracting features of address nodes by combining an analysis of accounts involved in traditional finance with information specific to Ethereum transactions. Next, we design a novel adaptive feature importance filtering method that adjusts the filtering threshold based on the final classification results, reducing the feature dimensionality while preserving an acceptable level of detection performance. Finally, a random forest is used to classify whether an address is a phishing scam address. Extensive experiments on real Ethereum datasets show that our approach (98.89% precision, 98.35% recall, 98.62% F1) achieves state-of-the-art performance.
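The abstract does not spell out how the filtering threshold is adapted, so the following is only a minimal sketch of one plausible reading: raise an importance threshold step by step and stop as soon as the classification score drops more than a tolerance below the all-features baseline. The function name, the `tolerance` and `steps` parameters, and the toy scoring function are illustrative assumptions, not the paper's method. In practice the importances might come from a trained random forest (for example scikit-learn's `feature_importances_`) and `evaluate` would retrain and score the classifier on the kept features.

```python
from typing import Callable, List, Sequence, Tuple


def adaptive_importance_filter(
    importances: Sequence[float],
    evaluate: Callable[[List[int]], float],
    tolerance: float = 0.01,   # assumed: acceptable drop in the score
    steps: int = 20,           # assumed: number of candidate thresholds
) -> Tuple[List[int], float]:
    """Return (kept feature indices, chosen threshold).

    Sweeps the threshold upward and keeps the last feature subset whose
    score stays within `tolerance` of the all-features baseline.
    """
    all_features = list(range(len(importances)))
    baseline = evaluate(all_features)
    best_features, best_threshold = all_features, 0.0
    max_importance = max(importances)

    for step in range(1, steps + 1):
        # Candidate thresholds stay strictly below the largest importance,
        # so at least one feature always survives the filter.
        threshold = max_importance * step / (steps + 1)
        kept = [i for i, w in enumerate(importances) if w >= threshold]
        if evaluate(kept) >= baseline - tolerance:
            best_features, best_threshold = kept, threshold
        else:
            break  # score degraded too much; keep the previous subset
    return best_features, best_threshold


# Toy stand-in for "train a classifier and return its F1 score":
# only features 0, 1 and 2 are informative in this made-up setting.
def toy_evaluate(features: List[int]) -> float:
    informative = {0, 1, 2}
    return 0.90 + 0.03 * len(informative.intersection(features))


toy_importances = [0.40, 0.30, 0.20, 0.05, 0.03, 0.02]
kept, threshold = adaptive_importance_filter(toy_importances, toy_evaluate)
print(kept, round(threshold, 3))
```

On the toy data the three low-importance features are dropped while the score is unchanged, which is the trade-off the abstract describes: lower feature dimensionality with detection performance held near the baseline.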
