Supervised PU Learning for Cyber Security Event Prioritization

Wang-Yan Feng, Shu-Ning Wu

doi:10.12783/dtcse/cnsce2017/8879

Abstract

To ensure the cyber security of an enterprise, a SIEM (Security Information and Event Management) system is typically in place to raise alerts on security events and assign each of them a severity score based on predetermined rules. Analysts in the security operations center (SOC) investigate the high-severity events to decide whether they are truly malicious. However, the number of events is generally overwhelmingly large, far exceeding the SOC's budget to handle them. There is a great need for a machine learning system to assist in the accurate detection of malicious events. Traditional supervised learning algorithms cannot be directly applied to this problem because only a very small percentage of events are verified and labelled as malicious (called positively labelled) by SOC analysts, while the vast majority of events are unlabelled because they are never investigated. In this paper, we propose to aggregate security events to the host level and use a supervised PU (Positive Unlabeled) learning technique to accurately detect high-risk hosts. We use a Support Vector Machine with a radial basis kernel for label propagation and classification, and achieve an AUC (area under the curve) of 0.96 and a lift of 18 relative to the current rule-based alerting mechanism.
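As a rough illustration of the host-level aggregation and the two reported metrics, the following minimal sketch uses only the Python standard library. It is not the authors' implementation. The event layout `(host, severity)`, the three aggregate features, and the function names `aggregate_to_hosts`, `auc`, and `lift_at_k` are all hypothetical. The abstract does not define lift, so the sketch assumes one common reading: the number of known malicious hosts among the top k ranked by the model, divided by the number among the top k ranked by the baseline. Unlabelled hosts are treated as negatives here purely for scoring, which is also an assumption.

```python
from collections import defaultdict


def aggregate_to_hosts(events):
    """Roll per-event records up into one feature row per host.

    Each event is a (host, severity) pair; this layout is hypothetical.
    """
    by_host = defaultdict(list)
    for host, severity in events:
        by_host[host].append(severity)
    return {
        host: {
            "n_events": len(sev),
            "max_severity": max(sev),
            "mean_severity": sum(sev) / len(sev),
        }
        for host, sev in by_host.items()
    }


def auc(scores, labels):
    """Chance that a random positive outranks a random negative.

    Ties count as half a win. Requires at least one positive and one
    negative label.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def lift_at_k(model_scores, baseline_scores, labels, k):
    """Positives in the model's top k divided by positives in the baseline's top k.

    Assumed definition; raises ZeroDivisionError if the baseline finds none.
    """
    def hits(scores):
        order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        return sum(labels[i] for i in order[:k])

    return hits(model_scores) / hits(baseline_scores)
```

In a pipeline along these lines, the per-host rows from `aggregate_to_hosts` would feed the classifier, and the rule-based severity ranking would play the role of `baseline_scores` when computing lift.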
