Abstract

The limitations of signature-based intrusion detection systems (IDSs) have driven the popularity of machine learning (ML) approaches for building such systems. ML is a subfield of artificial intelligence that enables algorithms to learn from data, and its applications have been widely adopted in many domains. Achieving a promising ML-based model that can identify attacks and intrusions in networks and cyberspace depends on several stages of the ML pipeline, including pre-processing, attribute selection, model building, and hyperparameter tuning. The CICIDS2017 intrusion dataset was used for all experiments. This study focuses on building a cyber threat detection model based on an ensemble feature selection and classification method. Innovative approaches were used for the analysis and pre-processing of the dataset. The XGBoost algorithm was then used to select relevant features from the default input attributes in each of the captures, and the reduced feature set was employed in the identification of cyber intrusions. Averaged over the 8 captures of the dataset, the model achieved an accuracy of 98%, a precision of 0.98, a recall of 0.98, an F1-score of 0.98, and an AUC-ROC score of 0.99. The study concludes that the XGBoost-based model achieved promising results owing to proper dataset encoding, feature importance-based feature selection, and tuning of the algorithm for intrusion identification.
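
To make the two core steps concrete, the following is a minimal sketch of importance-based feature selection followed by computation of the reported metrics. It is not the authors' implementation: the threshold rule, the function names `select_features` and `binary_metrics`, the feature names, the importance scores, and the labels are all illustrative assumptions. In practice the scores would presumably come from a trained XGBoost model rather than being written by hand.

```python
from typing import Dict, List, Sequence


def select_features(importances: Dict[str, float], threshold: float) -> List[str]:
    """Keep features whose importance is at least `threshold`, highest first.

    Assumption: a simple cut-off rule; the abstract does not state how the
    authors chose which features to keep.
    """
    kept = [(name, score) for name, score in importances.items() if score >= threshold]
    kept.sort(key=lambda item: item[1], reverse=True)
    return [name for name, _ in kept]


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for labels 1 (attack) and 0 (benign)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # attacks caught
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # benign passed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed attacks
    n = len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    accuracy = (tp + tn) / n if n else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical importance scores (placeholder names and values, not results from the study).
example_scores = {
    "Flow Duration": 0.40,
    "Destination Port": 0.35,
    "Fwd Packet Length Max": 0.20,
    "Idle Std": 0.05,
}
selected = select_features(example_scores, threshold=0.10)

# Hypothetical ground truth and predictions for eight flows.
example_true = [1, 1, 1, 0, 0, 0, 1, 0]
example_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = binary_metrics(example_true, example_pred)

print(selected)
print(metrics)
```

A fixed threshold is only one possible selection rule; keeping the top-k features by score would be an equally reasonable reading of "feature importance-based feature selection". AUC-ROC is omitted here because it requires predicted probabilities rather than hard labels.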
