Network traffic data contains large volumes of redundant and irrelevant information, which poses significant hurdles for intrusion detection. Effective feature selection is crucial for eliminating that information. At present, most filter and embedded methods rely on fixed thresholds or ratios, which require prior knowledge to set. Wrapper methods, by contrast, are computationally intensive, and relying on any single feature selection method may bias the evaluation. To address these challenges, this study introduces Adaptive Neighborhood based Feature Selection (AN-SFS), a dynamic feature selection approach that adapts to the local statistical properties of the data. Unlike traditional methods, AN-SFS adjusts its threshold based on the characteristics of the current feature subset and incorporates statistical measures of neighboring features, which lets it capture subtle relationships and dependencies among features. This adaptability enables AN-SFS to achieve robust and effective feature selection. On the NSL-KDD and UNSW-NB15 datasets, our model outperforms conventional ML classifiers in detection rate, precision, and recall, achieving accuracy of 99.3% on NSL-KDD and 97.5% on UNSW-NB15 and outperforming contemporary methods. To further assess the feature selection approach, we compared it with other statistical methods, including ANOVA, Pearson correlation, and the chi-square test; it consistently outperformed each of them.
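To make the idea of a neighborhood-dependent threshold concrete, the sketch below shows one possible reading of it. It is not the AN-SFS algorithm, whose details the abstract does not give. The scoring function (absolute Pearson correlation with the label), the `window` parameter, the use of the neighborhood mean as the threshold, and the names `pearson` and `adaptive_select` are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences; 0.0 if either is constant."""
    mx, my = mean(xs), mean(ys)
    sx, sy = pstdev(xs), pstdev(ys)
    if sx == 0 or sy == 0:
        return 0.0
    cov = mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])
    return cov / (sx * sy)


def adaptive_select(columns, labels, window=1):
    """Hypothetical adaptive filter (an assumption, not the paper's method).

    Each feature is scored by |Pearson correlation| with the label. Feature i
    is kept when its score is at least the mean score of its neighborhood:
    the features at most `window` positions away, itself included. The
    threshold therefore varies from feature to feature instead of being fixed.
    """
    scores = [abs(pearson(col, labels)) for col in columns]
    selected = []
    for i, score in enumerate(scores):
        lo = max(0, i - window)
        hi = min(len(scores), i + window + 1)
        local_threshold = mean(scores[lo:hi])
        if score >= local_threshold:
            selected.append(i)
    return selected, scores
```

Calling `adaptive_select(columns, labels)` with `columns` as a list of per-feature value lists returns the indices of the retained features together with all scores. A fixed-threshold filter would replace `local_threshold` with a constant, which is the design the abstract argues against.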