The growth of the Internet of Things (IoT) has given rise to new processing, networking infrastructure, data storage, and management capabilities. The massive volume of data it produces may be used to provide high-value information for decision support, forecasting, business intelligence, data-intensive scientific research, etc. However, because IoT is distributed, virtualized, cloud-integrated, and internet-connected, the IoT environment is prone to various cyber-attacks and security issues, such as worms, malware, virtual machine escape, and distributed denial-of-service (DDoS) attacks, which can make data chunks in the incoming and outgoing streams inaccessible. The increasing frequency and potency of recent attacks and the constantly evolving attack vectors necessitate the development of improved detection approaches. This paper therefore proposes a distributed computing-based security model to safeguard big data systems. The proposed ensemble multi binary attack model (EMBAM) is an anomaly-based intrusion detection system (IDS) that distinguishes normal behaviour from abnormal behaviour, e.g., attacks and threats in a network. EMBAM combines multiple binary classifiers into a single model by stacking. The core binary model is a decision tree classifier whose hyperparameters are optimized using the grid search method. Using multiple binary classifiers allows each classifier to compensate for the limitations of the others in determining the intercorrelation between the available features and the attack, so as to achieve a high detection rate and a low false alarm rate. The results obtained with EMBAM, together with the insights and suggested improvements they provide, confirm its superiority in accuracy, detection rate, and efficiency. Two datasets, UNSW-NB15 and CICIDS2017, were used to validate the experiments.
EMBAM is compared empirically with eight or more state-of-the-art methods using performance metrics such as accuracy, precision, detection rate, false alarm rate, and F1-score. EMBAM can recognize multiple attack types, which is a plug-and-play advantage in a highly dynamic setting. The proposed approach outperforms existing approaches on the UNSW-NB15 dataset and yields competitive results on the CICIDS2017 dataset.
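The performance metrics named above can be made concrete with a short sketch. It is not taken from the paper: it uses the standard textbook definitions and assumes that label 1 means "attack" (the positive class) and label 0 means "normal". The names `BinaryCounts`, `confusion_counts`, and `ids_metrics` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass(frozen=True)
class BinaryCounts:
    """Confusion-matrix counts, with 'attack' assumed to be the positive class."""
    tp: int  # attacks flagged as attacks
    fp: int  # normal traffic flagged as attacks (false alarms)
    tn: int  # normal traffic passed as normal
    fn: int  # attacks missed


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> BinaryCounts:
    """Count outcomes for binary labels (assumption: 1 = attack, 0 = normal)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return BinaryCounts(tp, fp, tn, fn)


def _ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def ids_metrics(c: BinaryCounts) -> Dict[str, float]:
    """Compute the five metrics using their standard definitions."""
    precision = _ratio(c.tp, c.tp + c.fp)
    detection_rate = _ratio(c.tp, c.tp + c.fn)  # also known as recall
    return {
        "accuracy": _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn),
        "precision": precision,
        "detection_rate": detection_rate,
        "false_alarm_rate": _ratio(c.fp, c.fp + c.tn),
        "f1_score": _ratio(2 * precision * detection_rate,
                           precision + detection_rate),
    }


if __name__ == "__main__":
    # Illustrative counts only; these are not results from the paper.
    for name, value in ids_metrics(BinaryCounts(tp=80, fp=10, tn=90, fn=20)).items():
        print(f"{name}: {value:.4f}")
```

Under these definitions, a high detection rate means few missed attacks and a low false alarm rate means little normal traffic is flagged, which is why the two are reported together.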