A Machine Learning-Based Framework with Enhanced Feature Selection and Resampling for Improved Intrusion Detection

Fazila Malik, Qazi Waqas Khan, Rana Alnashwan, Atif Rizwan, Ghada Atteia

doi:10.3390/math12121799

Abstract

Intrusion Detection Systems (IDSs) play a crucial role in safeguarding network infrastructures from cyber threats and ensuring the integrity of highly sensitive data. Conventional IDS technologies, although they achieve high accuracy, frequently suffer from substantial model bias. This bias is caused primarily by class imbalance in the data and by features that carry little predictive relevance. This study tackles these challenges by proposing a machine learning (ML) based IDS that minimizes misclassification errors and corrects model bias, which significantly improves the predictive accuracy and generalizability of the IDS. The proposed system employs feature selection techniques, namely Recursive Feature Elimination (RFE), sequential feature selection (SFS), and statistical feature selection, to refine the input feature set and minimize the impact of non-predictive attributes. In addition, this work incorporates data resampling methods, namely Synthetic Minority Oversampling Technique with Edited Nearest Neighbor (SMOTE_ENN), Adaptive Synthetic Sampling (ADASYN), and Synthetic Minority Oversampling Technique–Tomek Links (SMOTE_Tomek), to address class imbalance and improve the accuracy of the model. The experimental results indicate that the proposed model, especially when using the random forest (RF) algorithm, surpasses existing models in accuracy, precision, recall, and F-score across the different data resampling methods. Using the ADASYN resampling method, the RF model achieves an accuracy of 99.9985% for botnet attacks and 99.9777% for Man-in-the-Middle (MITM) attacks, demonstrating the effectiveness of the approach on imbalanced data distributions. This research not only improves the ability of IDSs to identify botnet and MITM attacks but also provides a scalable and efficient solution that can be used in other areas where data imbalance is a recurring problem. This work has implications beyond IDSs, offering valuable insights into using ML techniques in complex real-world scenarios.
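The resampling methods named in the abstract (SMOTE_ENN, ADASYN, SMOTE_Tomek) all build on one idea: generating synthetic minority-class samples by interpolating between existing minority samples and their nearest minority neighbours. The sketch below illustrates only that core interpolation step using the Python standard library. It is a simplified illustration, not the authors' implementation: it omits ADASYN's density-adaptive sampling and the ENN and Tomek-link cleaning steps, and the function name `smote_like_oversample`, its parameters, and the toy data are assumptions made for this example.

```python
import math
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between minority samples.

    Simplified sketch of the core SMOTE idea (hypothetical helper, not a
    library API): pick a minority sample, pick one of its k nearest minority
    neighbours, and place a new point at a random position between them.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank every other minority sample by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical toy data: 2 features, 20 "benign" rows versus 4 "attack" rows.
benign = [(float(i % 5), float(i // 5)) for i in range(20)]
attack = [(10.0, 10.0), (11.0, 10.5), (10.5, 11.0), (11.5, 11.5)]

# Generate enough synthetic attack rows to match the benign class size.
new_attack = smote_like_oversample(attack, n_new=len(benign) - len(attack))
balanced_attack = attack + new_attack
print(len(benign), len(balanced_attack))  # prints "20 20"
```

Because each synthetic row lies on a line segment between two real minority rows, the oversampled class stays inside the region the minority data already occupies. In practice, resampling of this kind is normally applied to the training split only, so that evaluation data keeps its original class distribution.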

Journal: Mathematics
Publication Date: Jun 9, 2024
License type: CC BY 4.0

Similar Papers

A Rebalancing Framework for Classification of Imbalanced Medical Appointment No-show Data
Ulagapriya Krishnan ... Pushpa Sangar
Journal of Data and Information Science | VOL. 6
27 Jan 2021

Impact of Data Balancing and Feature Selection on Machine Learning-based Network Intrusion Detection
Azhari Shouni Barkah ... Rizki Wahyudi
JOIV : International Journal on Informatics Visualization | VOL. 7
28 Feb 2023

Enhanced Intrusion Detection with LSTM-Based Model, Feature Selection, and SMOTE for Imbalanced Data
Hussein Ridha Sayegh ... Ali Mansour Al-Madani
Applied Sciences | VOL. 14
05 Jan 2024

Applying machine learning methods to predict geology using soil sample geochemistry
Timothy C.C Lui ... Sharon A Cowling
Applied Computing and Geosciences | VOL. 16
11 Aug 2022
