In this study, we present a novel machine learning framework for web server anomaly detection that uniquely combines the Isolation Forest algorithm with expert evaluation, focusing on individual user activities within NGINX server logs. Our approach addresses the limitations of traditional methods by effectively isolating and analyzing subtle anomalies in vast datasets. Initially, the Isolation Forest algorithm was applied to extensive NGINX server logs, successfully identifying outlier user behaviors that conventional methods often overlook. We then employed DBSCAN for detailed clustering of these anomalies, categorizing them based on user request times and types. A key innovation of our methodology is the incorporation of post-clustering expert analysis. Cybersecurity professionals evaluated the identified clusters, adding a crucial layer of qualitative assessment. This enabled the accurate distinction between benign and potentially harmful activities, leading to targeted responses such as access restrictions or web server configuration adjustments. Our approach demonstrates a significant advancement in network security, offering a more refined understanding of user behavior. By integrating algorithmic precision with expert insights, we provide a comprehensive and nuanced strategy for enhancing cybersecurity measures. This study not only advances anomaly detection techniques but also emphasizes the critical need for a multifaceted approach in protecting web server infrastructures.
Read full abstract