ABSTRACT
The increasing sophistication of cyber threats necessitates advances in intrusion detection systems (IDS). This research introduces a novel IDS framework that integrates ensemble learning, transfer learning, and feature engineering to improve detection accuracy, adaptability, and interpretability. The ensemble learning component combines diverse classifiers (e.g., decision trees, k-nearest neighbors, and logistic regression), leveraging their complementary strengths to boost detection rates by 15% and reach a final accuracy of 95%. Transfer learning is implemented by pre-training models on related cybersecurity datasets (UNSW-NB15, NSL-KDD, and CICIDS2019) and fine-tuning them on the merged dataset, which reduces training time by 30%. Feature engineering adds interaction features and statistical transformations, which improve the model's sensitivity and increase detection rates by 20%, especially for complex attack patterns. To improve transparency and interpretability, this study incorporates two Explainable AI (XAI) methods: LIME (Local Interpretable Model-agnostic Explanations) and SHAP (SHapley Additive exPlanations). LIME approximates the local behavior of the complex model by fitting interpretable models around individual predictions, providing insight into feature contributions for specific instances. SHAP, by contrast, offers a global explanation grounded in cooperative game theory, assigning a Shapley value to each feature to quantify its overall importance. This dual approach lets security analysts understand both individual decisions and global feature importance, which increases trust in the IDS predictions and helps identify false positives and complex threats such as zero-day attacks.
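As an illustration of the Shapley values mentioned above, the following minimal sketch computes exact Shapley values for a toy scoring function by averaging each feature's marginal contribution over all feature orderings. It is not the framework's implementation and does not use the SHAP library, which relies on faster approximations. The function `toy_score`, the feature names (`duration`, `bytes_sent`, `failed_logins`), and the all-zero baseline are hypothetical and chosen only for demonstration.

```python
from itertools import permutations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley values via brute-force enumeration of feature orderings.

    Features not yet "revealed" take their baseline value. The cost grows
    factorially with the number of features, so this is only for tiny examples.
    """
    n = len(instance)
    totals = [0.0] * n
    for order in permutations(range(n)):
        current = list(baseline)
        prev = predict(current)
        for i in order:
            # Reveal feature i and record its marginal contribution.
            current[i] = instance[i]
            new = predict(current)
            totals[i] += new - prev
            prev = new
    n_orders = factorial(n)
    return [t / n_orders for t in totals]


def toy_score(x):
    """Hypothetical anomaly score with one interaction term (assumption)."""
    duration, bytes_sent, failed_logins = x
    return 0.5 * duration + 2.0 * failed_logins + 1.0 * bytes_sent * failed_logins


instance = [2.0, 3.0, 1.0]   # hypothetical flow: duration, bytes_sent, failed_logins
baseline = [0.0, 0.0, 0.0]   # hypothetical reference point

phi = shapley_values(toy_score, instance, baseline)
for name, value in zip(["duration", "bytes_sent", "failed_logins"], phi):
    print(f"{name}: {value:.2f}")
```

The attributions sum to the difference between the score of the instance and the score of the baseline, and the interaction term is split equally between the two features involved in it.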
The proposed framework demonstrates scalability and robustness, processing large volumes of network traffic data with minimal latency. However, two challenges were identified: the computational overhead of the ensemble and XAI methods, and the need for extensive labeled datasets for training. The framework's modular design facilitates integration with existing Security Information and Event Management (SIEM) systems, enhancing its practical applicability in enterprise, cloud, and IoT environments. Future research directions include deep learning-based ensemble methods, real-time adaptive learning algorithms, and application of the framework to other domains such as fraud detection and healthcare. In addition, user-centric explanations and interactive interfaces for analysts could further improve the interpretability and effectiveness of the IDS.