Web server load prediction and anomaly detection from hypertext transfer protocol logs

Lenka Benova,Ladislav Hudec

doi:10.11591/ijece.v13i5.pp5165-5178

Abstract

<p><span lang="EN-US">As network traffic increases and new intrusions occur, anomaly detection solutions based on machine learning are necessary to detect previously unknown intrusion patterns. Most of the developed models require a labelled dataset, which can be challenging owing to a shortage of publicly available datasets. These datasets are often too small to effectively train machine learning models, which further motivates the use of real unlabeled traffic. By using real traffic, it is possible to more accurately simulate the types of anomalies that might occur in a real-world network and improve the performance of the detection model. We present a method able to predict and categorize anomalies without the aid of a labelled dataset, demonstrating the model’s usability while also gathering a dataset from real noisy network traffic. The proposed long short-term memory (LTSM) based intrusion detection system was tested in a real-world setting of an antivirus company and was successful in detecting various intrusions using 5-minute windowing over both the predicted and real update curves thereby demonstrating its usefulness. Our contribution was the development of a robust model generally applicable to any hypertext transfer protocol (HTTP) traffic with almost real-time anomaly detection, while also outperforming earlier studies in terms of prediction accuracy.</span></p>

Full Text