Abstract

Network telescopes, or “Darknets”, receive unsolicited Internet-wide traffic, thus providing a unique window into macroscopic Internet activities associated with malware propagation, denial-of-service attacks, network reconnaissance, misconfigurations and network outages. Analysis of the resulting data can provide security analysts with actionable insights that can be used to prevent or mitigate cyber-threats. Large network telescopes, however, observe millions of nefarious scanning activities on a daily basis, which makes transforming the captured information into meaningful threat intelligence challenging. To address this challenge, we present a novel framework for characterizing the structure and temporal evolution of scanning behaviors observed in network telescopes. The proposed framework includes four components. It (i) extracts a rich, high-dimensional representation of scanning profiles composed of features distilled from network telescope data; (ii) uses deep representation learning to learn, in an unsupervised fashion, succinct, information-preserving representations of these scanning behaviors that are amenable to clustering; (iii) performs clustering of the scanner profiles in the resulting latent representation space on daily Darknet data; and (iv) detects temporal changes in scanning behavior using techniques from optimal mass transport.
We robustly evaluate the proposed system using both synthetic data and real-world Darknet data. We demonstrate its ability to detect real-world, high-impact cybersecurity incidents, such as the onset of the Mirai botnet in late 2016, and several interesting cluster formations in early 2022 (e.g., heavy scanners, evolved Mirai variants, and Darknet “backscatter” activities). Comparisons with state-of-the-art methods show that integrating the proposed features with the deep representation learning scheme leads to better classification performance for Darknet scanners.
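
As a rough illustration of component (iv), the sketch below compares the distribution of one scalar scanner feature on consecutive days using the one-dimensional Wasserstein-1 (earth mover's) distance, and flags days on which that distribution shifts sharply. This is a simplified stand-in and not the framework's actual procedure: the abstract does not specify how optimal mass transport is applied to the clustered latent space. The function names, the histogram input and the threshold are all hypothetical.

```python
from itertools import accumulate


def wasserstein_1d(hist_a, hist_b, bin_width=1.0):
    """Wasserstein-1 distance between two histograms on the same equally spaced bins.

    Assumption: each histogram counts scanners per bin of a single scalar
    feature (e.g., number of ports targeted). This is illustrative only.
    """
    if len(hist_a) != len(hist_b):
        raise ValueError("histograms must share the same bins")
    total_a, total_b = sum(hist_a), sum(hist_b)
    if total_a <= 0 or total_b <= 0:
        raise ValueError("histograms must have positive mass")
    # Normalize counts so both histograms carry unit mass.
    cdf_a = list(accumulate(x / total_a for x in hist_a))
    cdf_b = list(accumulate(x / total_b for x in hist_b))
    # In one dimension, W1 is the area between the two cumulative distributions.
    return bin_width * sum(abs(a - b) for a, b in zip(cdf_a, cdf_b))


def flag_changes(daily_hists, threshold):
    """Return indices of days whose histogram moved more than `threshold`
    (a hypothetical tuning parameter) away from the previous day's histogram."""
    return [
        day
        for day in range(1, len(daily_hists))
        if wasserstein_1d(daily_hists[day - 1], daily_hists[day]) > threshold
    ]
```

For example, `flag_changes([[10, 0, 0], [9, 1, 0], [0, 1, 9]], threshold=0.5)` flags only the third day (index 2), where most of the mass moves from the first bin to the last. A multi-dimensional variant over cluster centroids would require solving a transport problem rather than comparing cumulative sums.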

