Abstract

Unknown vulnerabilities, also known as zero-day vulnerabilities, are flaws in software, systems, or networks that have not yet been publicly disclosed or fixed. Once attackers discover such a vulnerability, whether intentionally or by accident, it poses a major threat to network security. The threat is especially severe in the blockchain field: smart contracts hold large amounts of funds, so an exploited vulnerability can cause substantial financial losses to users. However, current research on smart contract vulnerabilities focuses mainly on known vulnerabilities, and research on unknown vulnerabilities remains limited. To address this gap, we introduce a machine learning-based method for detecting unknown vulnerabilities in smart contracts. First, the method obtains the opcode sequences that smart contract transactions execute in the Ethereum Virtual Machine (EVM) by instrumenting Geth and replaying Ethereum transactions. Next, we employ an n-gram model and a vector weight penalty mechanism to extract features from the opcode sequences. We then use machine learning algorithms to detect unknown vulnerabilities based on the similarity principle. Finally, we evaluate the method with four machine learning models: K-Nearest Neighbor (KNN), Support Vector Machine (SVM), Logistic Regression (LR), and Decision Tree (DT). The SVM model performs best at detecting unknown vulnerabilities, with an accuracy of 96%, a precision of 91%, a recall of 100%, and an F1-score of 95%. We also discuss the main benefit of the method: it enables timely detection of attacks that exploit unknown vulnerabilities, thereby reducing user losses.
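The feature-extraction and similarity steps can be illustrated with a minimal sketch. It is not the authors' implementation: the opcode traces and labels below are hypothetical toy data, the vector weight penalty mechanism is omitted because its details are not given here, and a 1-nearest-neighbor lookup with cosine similarity stands in for the four evaluated models.

```python
from collections import Counter
from math import sqrt


def opcode_ngrams(opcodes, n=2):
    """Count the contiguous n-grams in one transaction's opcode sequence."""
    return Counter(tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse n-gram count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def nearest_label(query, labelled):
    """Return the label of the most similar labelled vector (1-NN)."""
    return max(labelled, key=lambda item: cosine_similarity(query, item[0]))[1]


# Hypothetical opcode traces; real traces would come from replayed transactions.
benign_trace = ["PUSH1", "PUSH1", "MSTORE", "CALLDATALOAD",
                "SLOAD", "ADD", "SSTORE", "STOP"]
attack_trace = ["PUSH1", "SLOAD", "CALL", "SLOAD", "CALL",
                "SLOAD", "CALL", "SSTORE", "STOP"]
query_trace = ["PUSH1", "SLOAD", "CALL", "SLOAD", "CALL", "SSTORE", "STOP"]

labelled = [
    (opcode_ngrams(benign_trace), "benign"),
    (opcode_ngrams(attack_trace), "attack"),
]
predicted = nearest_label(opcode_ngrams(query_trace), labelled)
print(predicted)
```

The query trace shares its repeated SLOAD/CALL bigrams with the attack trace, so it lies much closer to that vector than to the benign one. This illustrates the similarity principle: a trace produced by a previously unseen exploit can still resemble known attack behavior at the opcode level.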
