Abstract
With the rapid development of network technology, attacks have become more sophisticated and dangerous. Numerous strategies have been proposed to counter different types of attacks, and they have been evaluated using a variety of approaches. Intrusion detection systems (IDSs) are necessary to maintain network integrity and ensure network security. In this work, we investigate the effect of several feature extraction methods on IDS performance, evaluating them on two well-known intrusion detection datasets, NSL-KDD and CICIDS2017. Principal component analysis (PCA) is a useful preprocessing method because it lowers dimensionality, improves data quality, and enables visualization. However, its drawbacks must be taken into account, and it should be combined with other preprocessing methods as necessary. Classification is performed with the Decision Tree (DT), Random Forest (RF), Extreme Gradient Boosting (XGBoost), and Naive Bayes algorithms. This study compares the final intrusion detection accuracy of each model in order to assess the performance of these approaches and to better understand the robustness and generalizability of each strategy across different dataset features. The experimental results show that, with conventional preprocessing, the RF method reached a maximum accuracy of 98.57% on the NSL-KDD dataset and 97.10% on the CICIDS2017 dataset. With both standard and fusion preprocessing applied, the RF model was again the most dependable model on the NSL-KDD dataset, with an accuracy of 97.85%, and it achieved the best accuracy of 98.56% on the CICIDS2017 dataset. These findings demonstrate that PCA-based fusion preprocessing is not always the best option.
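The comparison described above can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: scikit-learn is used, a synthetic dataset stands in for NSL-KDD and CICIDS2017, "standard" preprocessing is taken to mean feature scaling only, and "fusion" preprocessing is taken to mean concatenating the scaled features with their PCA components. The helper `evaluate` and all parameter values are hypothetical, and the printed accuracies bear no relation to the reported results.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler


def evaluate(X_train, X_test, y_train, y_test, seed=0):
    """Fit a Random Forest and return its accuracy on the test split."""
    clf = RandomForestClassifier(n_estimators=50, random_state=seed)
    clf.fit(X_train, y_train)
    return accuracy_score(y_test, clf.predict(X_test))


# Synthetic stand-in data (assumption: NOT NSL-KDD or CICIDS2017).
X, y = make_classification(
    n_samples=1000, n_features=20, n_informative=8, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# "Standard" preprocessing (assumption: feature scaling only).
scaler = StandardScaler().fit(X_train)
X_train_std = scaler.transform(X_train)
X_test_std = scaler.transform(X_test)

# Fit PCA on the training split only, so no test information leaks in.
pca = PCA(n_components=5, random_state=0).fit(X_train_std)

# "Fusion" preprocessing (assumption: scaled features + PCA components).
X_train_fused = np.hstack([X_train_std, pca.transform(X_train_std)])
X_test_fused = np.hstack([X_test_std, pca.transform(X_test_std)])

acc_standard = evaluate(X_train_std, X_test_std, y_train, y_test)
acc_fused = evaluate(X_train_fused, X_test_fused, y_train, y_test)
print(f"standard: {acc_standard:.4f}  fused: {acc_fused:.4f}")
```

Swapping `RandomForestClassifier` for another scikit-learn estimator such as `DecisionTreeClassifier` or `GaussianNB` would extend the same comparison to other classifiers; whether the fused features help will depend on the dataset.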