Abstract
Future-oriented networking infrastructures are characterized by highly dynamic Streaming Data (SD) whose volume, speed and number of dimensions have increased significantly over the past couple of years, energized by trends such as Software-Defined Networking or Artificial Intelligence. As an essential core component of network security, Intrusion Detection Systems (IDS) help to uncover malicious activity. In particular, consecutively applied alert correlation methods can aid in mining attack patterns based on the alerts generated by IDS. However, most of the existing methods lack the functionality to deal with SD affected by the phenomenon called concept drift and are mainly designed to operate on the output of signature-based IDS. Although unsupervised Outlier Detection (OD) methods are able to detect yet unknown attacks, most alert correlation methods cannot handle the output of such anomaly-based IDS. In this paper, we introduce a novel framework called Streaming Outlier Analysis and Attack Pattern Recognition, denoted as SOAAPR, which is able to process the output of various online unsupervised OD methods in a streaming fashion to extract information about novel attack patterns. SOAAPR computes three different privacy-preserving, fingerprint-like signatures from the clustered set of correlated alerts, which characterize and represent the potential attack scenarios with respect to their communication relations, their manifestation in the data’s features and their temporal behavior. Beyond recognizing known attacks by comparing derived signatures, the signatures can be leveraged to find similarities between yet unknown, novel attack patterns. The evaluation, which is split into two parts, takes advantage of attack scenarios from the widely used CICIDS2017 and CSE-CIC-IDS2018 datasets.
First, the streaming alert correlation capability is evaluated on CICIDS2017 and compared to a state-of-the-art offline algorithm called Graph-based Alert Correlation (GAC), which has the potential to deal with the output of anomaly-based IDS. Second, the three types of signatures are computed from attack scenarios in the datasets and compared to each other. On the one hand, the results show that SOAAPR can compete with GAC in terms of alert correlation capability, as measured by four different metrics, and outperforms it significantly in terms of processing time, by an average factor of 70 across 11 attack scenarios. On the other hand, in most cases all three types of signatures seem to reliably characterize attack scenarios such that similar ones are grouped together, with up to 99.05% similarity between the FTP and SSH Patator attacks.
Highlights
In recent times, trends and technologies such as the Internet of Things, Software-Defined Everything and Artificial Intelligence have accelerated the increasing interconnection of networking devices.
To achieve streaming clustering for our purposes, we extend each cluster Ci with additional properties beyond the simple subset of alerts. We deem these properties mandatory to answer two fundamental questions. First, when is a cluster saturated, i.e., when is it ready for the process of signature generation? Second, when is a cluster with its alerts considered outdated, so that it should be discarded? To answer those two questions, we refer to Table 2 and define two significant user-definable parameters: a maximum total time to live for a cluster, denoted as tttl, and a minimum number of alerts that should be clustered in order to reasonably represent an attack scenario for signature derivation, denoted as min_alerts.
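The two lifecycle questions above can be illustrated with a minimal Python sketch. It is not the SOAAPR implementation: the class `AlertCluster`, the helper `sweep`, and the exact rules (saturated once `min_alerts` alerts are collected, outdated once `tttl` seconds have passed since creation) are assumptions made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class AlertCluster:
    """Hypothetical cluster Ci with the extra lifecycle properties."""
    created_at: float                          # creation time in seconds (assumed unit)
    alerts: list = field(default_factory=list)

    def add(self, alert):
        self.alerts.append(alert)

    def is_saturated(self, min_alerts: int) -> bool:
        # Assumed rule: ready for signature generation once
        # at least min_alerts alerts have been clustered.
        return len(self.alerts) >= min_alerts

    def is_outdated(self, now: float, tttl: float) -> bool:
        # Assumed rule: discard once the total time to live has elapsed.
        return now - self.created_at > tttl


def sweep(clusters, now, tttl, min_alerts):
    """Return (ready, kept); outdated, unsaturated clusters are dropped."""
    ready, kept = [], []
    for cluster in clusters:
        if cluster.is_saturated(min_alerts):
            ready.append(cluster)      # hand over to signature generation
        elif not cluster.is_outdated(now, tttl):
            kept.append(cluster)       # still collecting alerts
    return ready, kept
```

In this sketch a saturated cluster is handed over even if its time to live has expired; whether the original framework makes the same choice is not stated in the text above.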
The Jaccard index provides the similarity between the ideal cluster and the cluster Ci obtained by the alert correlation system.
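As a small illustration, the Jaccard index of two sets is the size of their intersection divided by the size of their union. The sketch below assumes that each cluster is represented as a set of alert identifiers; the function name `jaccard_index` and the convention of returning 1.0 for two empty sets are choices made here, not taken from the paper.

```python
def jaccard_index(ideal, obtained):
    """Jaccard index |A ∩ B| / |A ∪ B| of two collections of alert IDs."""
    a, b = set(ideal), set(obtained)
    if not a and not b:
        return 1.0  # assumed convention: two empty clusters are identical
    return len(a & b) / len(a | b)


# Hypothetical alert IDs: 2 shared alerts out of 5 distinct ones.
ideal_cluster = {1, 2, 3, 4}
obtained_cluster = {3, 4, 5}
print(jaccard_index(ideal_cluster, obtained_cluster))  # 0.4
```

A value of 1.0 means the obtained cluster matches the ideal one exactly, while 0.0 means they share no alerts.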
Summary
Trends and technologies such as the Internet of Things, Software-Defined Everything and Artificial Intelligence have accelerated the increasing interconnection of networking devices. The growth of Machine Learning (ML) has led to a boost in anomaly-based detection methods that create a model of trusted activity from a set of collected data samples and identify malicious activity by analyzing behavior deviations. Although this type of method is predestined to detect novel, yet unknown, attack patterns without requiring a priori knowledge, it is accompanied by a high ratio of False Positives (FPs) and False Negatives (FNs), which limits its utilization in real-world scenarios. Alert correlation is a common practice with aims such as false alert reduction, attack pattern identification, root cause detection of attacks or prediction of future attack steps by processing alerts from the heterogeneously applied IDS. It helps to reduce the sheer, unmanageable number of events generated (alarm flooding), especially with the continuously growing amount of high-dimensional data, which can no longer be handled by human analysts. Attacks detected by misuse-based IDS are denoted as the intrusion type or class.