A new feature popularity framework for detecting cyberattacks using popular features

Richard Zuech, John Hancock, Taghi M Khoshgoftaar

doi:10.1186/s40537-022-00661-9

https://doi.org/10.1186/s40537-022-00661-9

Journal: Journal of Big Data
Publication Date: Dec 15, 2022
Citations: 2
License type: open-access

Affiliation: Florida Atlantic University

Abstract

We propose a novel feature popularity framework and introduce it to the cybersecurity domain. Feature popularity has not yet been used in machine learning or data mining, and we implement it with three web attacks from the CSE-CIC-IDS2018 dataset: Brute Force, SQL Injection, and XSS. Feature popularity is based upon ensemble Feature Selection Techniques (FSTs) and allows us to more easily understand which features are common and important across different cyberattacks. Three filter-based and four supervised learning-based FSTs are used to generate feature subsets for each of our three web attack datasets, and then our feature popularity framework is applied. Classification performance with feature popularity subsets is mostly similar to performance when “all features” are evaluated (with feature popularity subsets performing better in 5 out of 15 experiments). Our feature popularity technique effectively builds an ensemble of ensembles by first building an ensemble of FSTs for each dataset, and then building another ensemble across a dataset agreement dimension. The Jaccard similarity is also employed with our feature popularity framework in order to better identify which attack classes should (or should not) be grouped together when applying feature popularity. The four most popular features across all three web attacks in this experiment are Flow_Bytes_s, Flow_IAT_Max, Fwd_IAT_Std, and Fwd_IAT_Total. When only these four features are used as input to our models, classification performance is not seriously degraded. This feature popularity framework granted us new and previously unseen insights into the web attack detection process with CSE-CIC-IDS2018 big data, even though we had intensely studied it previously. We realized that these four particular features cannot properly identify our three web attacks, because they are mainly time-based NetFlow features from layers 3 and 4 of the OSI model. Conversely, our three web attacks operate in the application layer (layer 7) of the OSI model and should not leave signatures in these four features. Feature popularity produces easier-to-explain models that give domain experts better visibility into the problem, and it can also reduce the complexity of implementing models in real-world systems.
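
The "ensemble of ensembles" idea and the Jaccard comparison described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the voting thresholds (`min_votes`, `min_datasets`), the function names, and the toy feature names (`feat_a`, `feat_b`, ...) are all assumptions made for illustration, and the abstract does not state how agreement is actually counted. The sketch first keeps features selected by enough FSTs within each attack dataset, then keeps features that survive in enough datasets, and finally computes the Jaccard similarity between the per-dataset subsets.

```python
from collections import Counter
from itertools import combinations


def ensemble_select(fst_subsets, min_votes):
    """First ensemble: keep features chosen by at least `min_votes` FSTs
    for a single dataset. The threshold is an illustrative assumption."""
    votes = Counter(f for subset in fst_subsets for f in set(subset))
    return {f for f, n in votes.items() if n >= min_votes}


def feature_popularity(per_dataset_features, min_datasets):
    """Second ensemble: keep features that appear in the selected subsets
    of at least `min_datasets` datasets (the dataset agreement dimension)."""
    votes = Counter(f for feats in per_dataset_features.values() for f in feats)
    return {f for f, n in votes.items() if n >= min_datasets}


def jaccard(a, b):
    """Jaccard similarity |A & B| / |A | B| between two feature sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Toy, made-up FST outputs (three FSTs per attack dataset); not real results.
fst_outputs = {
    "brute_force": [
        {"feat_a", "feat_b", "feat_c"},
        {"feat_a", "feat_b", "feat_x"},
        {"feat_a", "feat_c", "feat_y"},
    ],
    "sql_injection": [
        {"feat_a", "feat_d"},
        {"feat_a", "feat_d", "feat_x"},
        {"feat_b", "feat_z"},
    ],
    "xss": [
        {"feat_a", "feat_b"},
        {"feat_a", "feat_b", "feat_y"},
        {"feat_b", "feat_z"},
    ],
}

# Step 1: ensemble of FSTs within each dataset.
selected = {
    name: ensemble_select(subsets, min_votes=2)
    for name, subsets in fst_outputs.items()
}

# Step 2: ensemble across datasets.
popular = feature_popularity(selected, min_datasets=2)

# Step 3: pairwise Jaccard similarity between per-dataset subsets.
similarity = {
    (x, y): jaccard(selected[x], selected[y])
    for x, y in combinations(selected, 2)
}

print("popular features:", sorted(popular))
for pair, score in similarity.items():
    print(pair, round(score, 3))
```

In a sketch like this, a low Jaccard score between two attack datasets would suggest that their selected features overlap little, which is the kind of signal the abstract describes using to decide whether attack classes should be grouped together.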

Similar Papers

Feature Popularity Between Different Web Attacks with Supervised Feature Selection Rankers
Richard Zuech ... John Hancock
01 Dec 2021

Intrusion detection in cyber–physical environment using hybrid Naïve Bayes—Decision table and multi-objective evolutionary feature selection
Ranjit Panigrahi ... Waleed Alnumay
Computer Communications | VOL. 188
17 Mar 2022

Analysis of Tree-Based Classifiers for Web Attack Detection
Deshmukh Surbhi ... Kshirsagar Deepak
01 Jan 2020

Web Server Hacking
01 Jan 2019
