Abstract

Exploitability prediction has become increasingly important in cybersecurity, as the number of disclosed software vulnerabilities and exploits is soaring. Recently, machine learning and deep learning algorithms, including Support Vector Machine (SVM), Decision Tree, deep neural networks and their ensemble models, have achieved great success in vulnerability evaluation and exploitability prediction. However, they rely on the strong assumption that the data distribution is static over time, and therefore fail to account for the concept drift caused by evolving system behaviours. In this work, we propose a novel consecutive batch learning algorithm, called Real-time Dynamic Concept Adaptive Learning (RDCAL), to deal with the concept drift and dynamic class imbalance problems in exploitability prediction. Specifically, we develop a Class Rectification Strategy (CRS) to handle the ‘actual drift’ in sample labels and a Balanced Window Strategy (BWS) to boost the minority class during real-time learning. Experiments conducted on real-world vulnerabilities collected between 1988 and 2020 show that the overall performance of classifiers, including Neural Networks, SVM, HoeffdingTree and Logistic Regression (LR), improves by more than 3% when our proposed RDCAL algorithm is adopted. Furthermore, RDCAL achieves state-of-the-art performance on exploitability prediction compared with other concept drift algorithms.
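
To make the idea of boosting the minority class during consecutive batch learning more concrete, the following is a minimal sketch of one possible balanced window. It is not the BWS defined in the paper, whose details the abstract does not give. The class name `BalancedWindow`, the parameter `per_class_size`, and the rule of keeping an equal number of the most recent samples per class are all assumptions made for illustration.

```python
from collections import deque


class BalancedWindow:
    """Hypothetical balanced window: one bounded buffer of recent samples per class."""

    def __init__(self, per_class_size):
        self.per_class_size = per_class_size
        self.buffers = {}  # label -> deque of feature vectors, oldest first

    def update(self, batch):
        """Add one batch of (features, label) pairs; old samples fall out per class."""
        for features, label in batch:
            buf = self.buffers.setdefault(label, deque(maxlen=self.per_class_size))
            buf.append(features)

    def training_set(self):
        """Return (features, label) pairs with the same number of samples per class."""
        if not self.buffers:
            return []
        # The smallest buffer (the minority class) sets the per-class sample count.
        n = min(len(buf) for buf in self.buffers.values())
        samples = []
        for label, buf in self.buffers.items():
            recent = list(buf)[-n:]  # the n most recent samples of this class
            samples.extend((features, label) for features in recent)
        return samples


# Toy stream of two batches; label 1 stands in for the rarer "exploited" class.
window = BalancedWindow(per_class_size=100)
stream = [
    [([0.1], 0), ([0.2], 0), ([0.3], 0), ([0.9], 1)],
    [([0.4], 0), ([0.5], 0), ([0.8], 1)],
]
for batch in stream:
    window.update(batch)
    train = window.training_set()
    # A classifier would be refit or incrementally updated on `train` here.
```

Because each class has its own buffer, minority samples from earlier batches stay available even when a later batch contains few or none of them, while the majority class is truncated to its most recent samples. This sketch does not address label changes over time, which is the problem the abstract assigns to CRS.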
