Abstract
SummaryThis study addresses the challenge of data imbalance and class overlap in machine learning for intrusion detection, proposing that targeted algorithmic adjustments can significantly enhance model performance. Our hypothesis contends that an ensemble framework, adeptly integrating novel threshold‐adjustment algorithms, can improve classification sensitivity and specificity. To test this, we developed an ensemble model comprising Balanced Bagging (BB), eXtreme Gradient Boosting (XGBoost), and Random Forest (RF), fine‐tuned using grid search for BB and XGBoost, and augmented with the Hellinger metric for RF to tackle data imbalance. The innovation lies in our algorithms, which adeptly adjust the discrimination threshold to rectify the class overlap problem, enhancing the model's ability to discern between negative and positive classes. Utilizing the UNSW‐NB15 dataset, we conducted a comparative analysis for binary and multi‐category classification. Our ensemble model achieved a binary classification accuracy of 97.80%, with a sensitivity rate of 98.26% for detecting attacks, and a multi‐category classification accuracy and sensitivity that reached up to 99.73% and 97.24% for certain attack types. These results substantially surpass those of existing models on the same dataset, affirming our model's superiority in dealing with complex data distributions prevalent in network security domains.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
More From: Concurrency and Computation: Practice and Experience
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.