Cyberspace continues to host highly sophisticated malicious entities that have demonstrated their ability to launch debilitating, intimidating, and disruptive cyber attacks. Recently, such entities have been adopting orchestrated, often botmaster-coordinated, stealthy attack strategies aimed at maximizing their coverage of targets while minimizing redundancy and overlap. These entities, typically dubbed bots within botnets, are ominously being leveraged to cause drastic Internet-wide and enterprise impacts by means of severe misdemeanors. While a plethora of approaches in the literature have devised operational cyber security techniques for detecting such botnets, very few have tackled the problem of how to promptly and effectively take them down. Over the past three years, we have received 12 GB of real malicious darknet data daily (i.e., Internet traffic destined to half a million routable but unallocated IP addresses, or sensors) from more than 12 countries. This article exploits such data to propose a novel Internet-scale cyber security capability that fuses big data behavioral analytics with formal graph-theoretical concepts to promptly infer and attribute Internet-scale infected bots and to identify the niche of the botnet for effective takedowns. We validate the accuracy of the proposed approach using 100 GB of data from the Carna botnet, a very recent, real, malicious Internet-scale botnet. Since performance is also an imperative metric when dealing with big data for network security, this article further compares two trending big data processing architectures, the almost standard Apache Hadoop system and a more traditional and simplistic multi-threaded programming approach, using 1 TB of real darknet data. The article concludes with several recommendations and possible directions for future research derived from these experiments.