Web Traffic Anomaly Detection Using Isolation Forest

Wilson Chua, Arsenn Lorette Diamond Pajas, Crizelle Shane Castro, Sean Patrick Panganiban, April Joy Pasuquin, Merwin Jan Purganan, Rica Malupeng, Divine Jessa Pingad, John Paul Orolfo, Haron Hakeen Lua, Lemuel Clark Velasco

doi:10.3390/informatics11040083

Abstract

As companies increasingly undergo digital transformation, the value of their data assets also rises, making them even more attractive targets for hackers. The large volume of weblogs warrants the use of advanced classification methodologies so that cybersecurity specialists can identify web traffic anomalies. This study implements Isolation Forest, an unsupervised machine learning method, to distinguish anomalous from non-anomalous web traffic. A publicly available weblogs dataset from an e-commerce website underwent data preparation through a systematic pipeline of data ingestion, data type conversion, data cleaning, and normalization. This preparation added derived columns to the training set and to a manually labeled testing set, which was then used to compare the anomaly detection performance of the Isolation Forest model with that of cybersecurity experts. The Isolation Forest model was implemented using the Python Scikit-learn library and achieved an accuracy of 93%, precision of 95%, recall of 90%, and F1-score of 92%. Through appropriate data preparation, model development, model implementation, and model evaluation, this study shows that Isolation Forest can be a viable solution for near-accurate web traffic anomaly detection.
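The workflow summarized in the abstract (normalize features, fit an Isolation Forest with Scikit-learn, then score predictions against a labeled test set) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the two features, the synthetic data, and all parameter values are assumptions made for the example, and the model here is fitted on normal-only synthetic traffic for simplicity.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)

# Hypothetical derived features per client: [requests_per_minute, response_bytes].
# The real study derives its columns from e-commerce weblogs; these are synthetic.
normal_train = rng.normal(loc=[20.0, 500.0], scale=[5.0, 100.0], size=(1000, 2))
normal_test = rng.normal(loc=[20.0, 500.0], scale=[5.0, 100.0], size=(100, 2))
anomalous_test = rng.normal(loc=[200.0, 5000.0], scale=[20.0, 500.0], size=(100, 2))

# Stand-in for a manually labeled test set: 0 = non-anomalous, 1 = anomalous.
X_test = np.vstack([normal_test, anomalous_test])
y_test = np.array([0] * 100 + [1] * 100)

# Normalization step: fit the scaler on training data only, reuse it for testing.
scaler = StandardScaler().fit(normal_train)

# Unsupervised fit: no labels are passed to the model.
model = IsolationForest(n_estimators=100, contamination="auto", random_state=42)
model.fit(scaler.transform(normal_train))

# IsolationForest.predict returns +1 for inliers and -1 for outliers;
# map -1 to the anomalous class (1) so it lines up with y_test.
y_pred = (model.predict(scaler.transform(X_test)) == -1).astype(int)

metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred, zero_division=0),
    "recall": recall_score(y_test, y_pred, zero_division=0),
    "f1": f1_score(y_test, y_pred, zero_division=0),
}
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

The numbers printed by this sketch come from the synthetic data and say nothing about the paper's reported results. On real weblogs, the `contamination` setting and the choice of derived features would likely need tuning against a labeled sample.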

Journal: Informatics
Publication Date: Nov 5, 2024
License type: CC BY 4.0

