Abstract

The Border Gateway Protocol (BGP) is the default Internet routing protocol that manages connectivity among Autonomous Systems (ASes). Although BGP disruptions are rare, when they occur the consequences can be very damaging. Because of the importance of this problem, considerable effort has been aimed at understanding what constitutes normal and abnormal BGP traffic, so that potentially disruptive anomalous traffic can be quickly identified and its effects mitigated. Recent efforts indicate that machine learning (ML) techniques are capable of achieving a high level of accuracy and robustness in anomaly detection. However, even though public datasets of BGP messages exist, they are not suited to feed ML models directly, because the goal of these messages is to exchange fine-grained reachability information rather than to observe network behavior. Thus, in order to apply ML methods to BGP control plane data, the data must be preprocessed and features extracted into a format suitable as input to ML models (e.g., labelled tabular data). In this work, we implemented a dataset generation tool that extracts relevant features from BGP control plane messages, along with tools that assist in labelling the anomaly period. Our tool extracts the volume and AS path features most commonly used by anomaly detection techniques, as well as novel distribution features that allow changes in BGP traffic to be observed in a straightforward manner. We used this tool to analyze 9 anomaly events through 27 points of observation and generated 81 datasets, with a total of 86,400 samples. The generated datasets and the source code of the developed tool were made publicly available for extension and reuse. Additionally, we analyzed which trends in BGP behavior can be used to distinguish regular traffic from anomalies and to distinguish between different types of anomalies.
