IoT information theft prediction using ensemble feature selection

Joffrey L Leevy, Jared M Peterson, John Hancock, Taghi M Khoshgoftaar

doi:10.1186/s40537-021-00558-z

Open Access

https://doi.org/10.1186/s40537-021-00558-z

Journal: Journal of Big Data	Publication Date: Jan 6, 2022
Citations: 11	License type: open-access

Affiliation: Florida Atlantic University

Abstract

Recent years have seen a proliferation of Internet of Things (IoT) devices and an associated security risk from an increasing volume of malicious traffic worldwide. For this reason, datasets such as Bot-IoT were created to train machine learning classifiers to identify attack traffic in IoT networks. In this study, we build predictive models with Bot-IoT to detect attacks represented by dataset instances from the Information Theft category, as well as dataset instances from the data exfiltration and keylogging subcategories. Our contribution is centered on the evaluation of ensemble feature selection techniques (FSTs) on classification performance for these specific attack instances. A group or ensemble of FSTs will often perform better than the best individual technique. The classifiers that we use are a diverse set of four ensemble learners (LightGBM, CatBoost, XGBoost, and random forest (RF)) and four non-ensemble learners (logistic regression (LR), decision tree (DT), Naive Bayes (NB), and a multi-layer perceptron (MLP)). The metrics used for evaluating classification performance are area under the receiver operating characteristic curve (AUC) and area under the precision-recall curve (AUPRC). For the most part, we determined that our ensemble FSTs do not affect classification performance but are beneficial because feature reduction eases computational burden and provides insight through improved data visualization.
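To make the two evaluation metrics concrete, the following is a minimal sketch in plain Python, not the authors' evaluation code. It computes AUC as the fraction of positive/negative pairs that are ranked correctly, and approximates AUPRC by average precision (a step-wise sum over the ranked list). The function names and the toy labels and scores are assumptions for illustration only; the pairwise AUC loop is quadratic and would not scale to the full dataset.

```python
from typing import Sequence


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Tied scores count as half a correct pair.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative instance")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Average precision, a step-wise approximation of AUPRC.

    Instances are ranked by descending score; precision is recorded at
    every rank where a positive instance appears, then averaged.
    Tied scores are not treated specially in this sketch.
    """
    n_pos = sum(1 for y in y_true if y == 1)
    if n_pos == 0:
        raise ValueError("need at least one positive instance")
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    return total / n_pos


# Toy example (hypothetical values): 1 = attack instance, 0 = normal traffic.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = roc_auc(toy_labels, toy_scores)
toy_ap = average_precision(toy_labels, toy_scores)
```

AUPRC is often reported alongside AUC for rare-attack detection because precision is sensitive to false positives when the positive class is small.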

Highlights

The Internet of Things (IoT) is a network of physical objects with limited computing capability [1].
Honestly Significant Difference (HSD) tests of the influence of hyperparameter tuning on results, in terms of area under the receiver operating characteristic curve (AUC), show that hyperparameter tuning yields better performance.
For performance in terms of AUC, the 4 Agree and 5 Agree feature selection techniques (FSTs) are in category ‘ab’, while All Features is in category ‘a’.
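The "4 Agree" and "5 Agree" labels suggest an agreement-based ensemble of feature rankers. The sketch below assumes that "k Agree" keeps a feature when at least k individual rankers place it in their top-ranked list; that reading, the ranker names, the feature names, and the `top_n` cut-off are all assumptions for illustration, not details taken from this page.

```python
from collections import Counter
from typing import Dict, List


def k_agree(rankings: Dict[str, List[str]], k: int, top_n: int) -> List[str]:
    """Keep features that appear in the top_n of at least k rankers.

    rankings maps a ranker name to its features, ordered best first.
    """
    votes: Counter = Counter()
    for ranked_features in rankings.values():
        # Each ranker votes once for every feature in its top_n list.
        votes.update(set(ranked_features[:top_n]))
    return sorted(feature for feature, count in votes.items() if count >= k)


# Hypothetical rankings from five rankers over placeholder feature names.
toy_rankings = {
    "ranker_a": ["f1", "f2", "f3", "f4"],
    "ranker_b": ["f2", "f1", "f5", "f3"],
    "ranker_c": ["f1", "f3", "f2", "f6"],
    "ranker_d": ["f4", "f1", "f2", "f5"],
    "ranker_e": ["f1", "f5", "f6", "f2"],
}

four_agree = k_agree(toy_rankings, k=4, top_n=3)
five_agree = k_agree(toy_rankings, k=5, top_n=3)
```

Raising k shrinks the selected subset, so a stricter agreement level trades feature coverage for a smaller, cheaper model input.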

Summary

Introduction

The IoT is a network of physical objects with limited computing capability [1]. Use of these smart devices has grown rapidly, and so has the security risk from malicious network traffic. One of the more recent datasets for network intrusion detection is Bot-IoT [2]. The Bot-IoT dataset contains instances of various attack categories: denial-of-service (DoS), distributed denial-of-service (DDoS), reconnaissance, and information theft. The processed full dataset was generated by the Argus network security tool [3] and is available from an online repository as several comma-separated values (CSV) files. Bot-IoT has 29 features and 73,370,443 instances.



Similar Papers

IoT Reconnaissance Attack Classification with Random Undersampling and Ensemble Feature Selection
Joffrey L Leevy ... John Hancock
01 Dec 2021

Detecting web attacks using random undersampling and ensemble learners
Richard Zuech ... Taghi M Khoshgoftaar
Journal of Big Data | VOL. 8
27 May 2021

Detecting Information Theft Attacks in the Bot-IoT Dataset
Joffrey L Leevy ... Taghi M Khoshgoftaar
01 Dec 2021

Using Random Undersampling and Ensemble Feature Selection for IoT Attack Prediction
Joffrey L Leevy ... John Hancock
International Journal of Reliability, Quality and Safety Engineering | VOL. 31
21 Nov 2023
