Abstract
The Internet infrastructure relies on the Border Gateway Protocol (BGP) to provide essential routing information, and abnormal routing behavior impairs global Internet connectivity and stability. Hence, anomaly detection algorithms are important for improving the performance of the BGP routing protocol. In this paper, we propose two algorithms. The first is the guide feature generator (GFG), which generates guide features from traditional features in BGP time-series data using moving regression in combination with a smoothed moving average. The second is a modified random forest feature selection algorithm that automatically selects the most dominant features (ASMDF). With our mechanism, the detected anomalies are more realistic and the selected features are generally consistent across time series. Experimental evaluations using multiple machine learning models reveal that the proposed algorithms achieve up to a 32.36% improvement in accuracy, up to a 35.44% reduction in false negative rate, and up to a 43.99% reduction in false positive rate compared with not using these algorithms. Moreover, ASMDF performs feature selection more than 3 times faster than most existing feature selection algorithms.
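To make the idea of guide features concrete, the following is a minimal sketch of how a moving regression and a smoothed moving average could be computed for a single BGP feature series. It is not the GFG algorithm itself, whose details are not given here. The function names, the window length `n`, the use of the regression slope as the guide value, and the recursive smoothed-average definition SMMA_t = (SMMA_{t-1}(n - 1) + x_t) / n are all assumptions made for illustration.

```python
import numpy as np

def smoothed_moving_average(x, n):
    """Smoothed moving average (assumed recursive definition).

    The first defined value is the plain mean of the first n samples;
    each later value is (previous * (n - 1) + current) / n.
    Positions before index n - 1 are NaN.
    """
    x = np.asarray(x, dtype=float)
    out = np.full(x.shape, np.nan)
    if len(x) < n:
        return out
    out[n - 1] = x[:n].mean()
    for t in range(n, len(x)):
        out[t] = (out[t - 1] * (n - 1) + x[t]) / n
    return out

def moving_regression_slope(x, n):
    """Least-squares slope of a straight line fitted to each window of n samples.

    The slope for the window ending at index t is stored at out[t];
    positions before index n - 1 are NaN.
    """
    x = np.asarray(x, dtype=float)
    out = np.full(x.shape, np.nan)
    t_axis = np.arange(n)
    for end in range(n, len(x) + 1):
        slope, _intercept = np.polyfit(t_axis, x[end - n:end], 1)
        out[end - 1] = slope
    return out

def guide_features(x, n):
    """Hypothetical guide features: one column per derived series."""
    return np.column_stack([
        smoothed_moving_average(x, n),
        moving_regression_slope(x, n),
    ])

# Illustrative input only: a synthetic count series with a sudden burst.
series = np.concatenate([np.full(20, 100.0), np.full(5, 400.0), np.full(20, 100.0)])
features = guide_features(series, n=5)
```

In this sketch the slope column reacts to the burst as soon as it enters the window, while the smoothed average changes more gradually. A detector could compare the two to mark candidate abnormal vector sets, although the actual marking rule is not specified in the text above.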
Highlights
The Border Gateway Protocol (BGP) is used to exchange routing information between border routers in a network comprising many autonomous systems
The small groups of successive vector sets produced by the first stage are passed to the proposed ASMDF algorithm (automatically select the most dominant features), which selects the most dominant features for each abnormal vector set only; each vector set has different dominant features. We propose this modified feature selection algorithm for three reasons: to adapt to incoming series data, to pass each abnormal vector set with its dominant features directly to the third stage, and to skip the third stage for unmarked normal vector sets. The third stage is any selected machine learning technique, which receives each vector set with its dominant features from the second stage and performs the classification
To speed up the feature selection computations, we suggest two improvements. The first is to generate a small group of successive vector sets with the guide feature generator (GFG) algorithm. The second is to use orientation rather than magnitude: applying the cosine similarity equation to the vector set produces a smaller cosine similarity matrix, which generates fewer random forest trees than the corresponding vector set
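The cosine similarity step can be illustrated with a short sketch. The code below assumes that a vector set is a matrix with one row per time sample and one column per feature, and that the similarity is taken between feature columns, so that an n × d vector set yields a d × d matrix. Both the layout and the function name are assumptions; the text above does not state how the matrix is built or how it is fed to the random forest.

```python
import numpy as np

def feature_cosine_similarity(vector_set):
    """Cosine similarity between the feature columns of a vector set.

    Input shape is (n_samples, n_features); output shape is
    (n_features, n_features). Only the orientation of each column
    matters, so rescaling a feature leaves the result unchanged.
    """
    X = np.asarray(vector_set, dtype=float)
    norms = np.linalg.norm(X, axis=0)
    # Avoid division by zero: an all-zero column stays zero and
    # therefore gets similarity 0 with every column, itself included.
    norms[norms == 0] = 1.0
    U = X / norms
    return U.T @ U

# Illustrative data only: 50 time samples of 4 hypothetical features.
rng = np.random.default_rng(0)
vector_set = rng.random((50, 4))
similarity = feature_cosine_similarity(vector_set)
```

Under these assumptions the similarity matrix has 16 entries instead of the 200 in the vector set, which is one way to read the claim that the smaller matrix reduces the work done by the random forest.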
Summary
The Border Gateway Protocol (BGP) is used to exchange routing information between border routers in a network comprising many autonomous systems. Many techniques have been employed to detect BGP anomalies [1]. These existing anomaly detection methods base their decisions on the traffic features of the present moment, regardless of the time series of the traffic data, although time-series analysis can bring important extra information for identifying state changes. We propose this modified feature selection algorithm for three reasons: to adapt to incoming series data, to pass each abnormal vector set with its dominant features directly to the third stage, and to skip the third stage for unmarked normal vector sets. The third stage is any selected machine learning technique, which receives each vector set with its dominant features from the second stage and performs the classification. This reduces processing time and computation. Traffic data are multivariate time series, and the anomaly patterns vary gradually with the temporal information. For this reason, we propose our work.
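The saving in processing time comes from running the classifier only on vector sets that were marked abnormal. A minimal sketch of that control flow follows. The function `run_pipeline` and its three callable parameters are hypothetical names introduced for illustration; the marking rule, the feature selector, and the classifier are placeholders, not the methods of the paper.

```python
from typing import Callable, List, Sequence, Tuple

VectorSet = Sequence[Sequence[float]]  # rows are time samples, columns are features

def run_pipeline(
    vector_sets: Sequence[VectorSet],
    is_marked_abnormal: Callable[[VectorSet], bool],
    select_features: Callable[[VectorSet], List[int]],
    classify: Callable[[List[List[float]]], str],
) -> List[Tuple[int, str]]:
    """Run stages two and three only for vector sets marked abnormal."""
    results = []
    for index, vector_set in enumerate(vector_sets):
        if not is_marked_abnormal(vector_set):
            continue  # unmarked normal sets never reach the classifier
        columns = select_features(vector_set)  # stage two: dominant features
        reduced = [[row[c] for c in columns] for row in vector_set]
        results.append((index, classify(reduced)))  # stage three
    return results

# Toy stand-ins for the three stages (assumptions for illustration only).
calls = []
def toy_marker(vs): return max(max(row) for row in vs) > 10
def toy_selector(vs): return [0]
def toy_classifier(reduced):
    calls.append(len(reduced))
    return "anomaly"

sets = [[[1, 2], [3, 4]], [[50, 2], [60, 4]], [[2, 2], [1, 1]]]
labels = run_pipeline(sets, toy_marker, toy_selector, toy_classifier)
```

In this toy run only the second vector set exceeds the marker threshold, so the classifier is invoked once rather than three times.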
More From: TURKISH JOURNAL OF ELECTRICAL ENGINEERING & COMPUTER SCIENCES