Abstract
In recent years, data-driven approaches have attracted the interest of the research community. Considering network monitoring, unsupervised machine learning solutions such as clustering are particularly appealing to let the network analysts observe patterns, and track the evolution of traffic over time. In this paper, we present a novel unsupervised methodology to automatically process and analyze batches of HTTP traffic, looking just at the URL structure. First, we describe IDBSCAN, Iterative-DBSCAN. We design it to obtain well-shaped clusters, and to simplify the choice of parameters — often a cumbersome step for the network analyst. Second, we show LENTA, Longitudinal Exploration for Network Traffic Analysis, which allows to automatically observe the evolution over time of traffic, naturally highlighting trends and pinpointing anomalies.We first evaluate IDBSCAN and LENTA on synthetic data to compare their performance against well-known algorithms. Then we apply them on a real case, facing the analysis of hundred thousands of URLs collected from a live network. Results show both the goodness of clusters produced by IDBSCAN and LENTA ability to highlight changes in traffic, facilitating the analyst job.
Published Version
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.