Design and evaluation of adaptive system for big data cyber security analytics

Faheem Ullah, M Ali Babar, Aldeida Aleti

doi:10.1016/j.eswa.2022.117948

Abstract

Big Data Cyber Security Analytics (BDCA) systems leverage big data technologies to collect, store, and analyze large volumes of security event data in order to detect cyber-attacks. Big data analytical frameworks (e.g., Apache Hadoop and Apache Spark) are used to build BDCA systems. These frameworks come with dozens of configurable parameters that need to be tuned for each security dataset if the BDCA system is to perform at its full capacity. Manually tuning so many parameters for each dataset is a time-consuming and resource-intensive task. Tuning techniques have therefore been proposed to tune framework parameters automatically. However, previous tuning techniques not only require users to manually initiate and execute the tuning process but are also computationally expensive. In this paper, we present ADAPTER, a parameter-driven adaptation system that automatically triggers the tuning process, tunes the framework’s configuration parameters for different security datasets, and finally executes the BDCA system with the adapted configuration. ADAPTER uses empirically designed fuzzy rules that tune each configuration parameter based on the parameter’s impact on resource utilization, such as CPU and disk utilization. We have evaluated ADAPTER for fully distributed Spark-based and Hadoop-based BDCA systems using five security datasets and several adaptation scenarios. Our evaluation shows that, with respect to default settings, ADAPTER (i) reduces the training time by 35.9% and 31.1% for Spark and Hadoop, respectively; (ii) reduces the testing time by 15.3% and 10.1% for Spark and Hadoop, respectively; (iii) improves resource utilization by 36.6% for Spark and 55% for Hadoop; (iv) reduces tuning time by 499 times and 135 times for Spark and Hadoop, respectively; and (v) improves data locality by 4%.
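
To illustrate the general idea of tuning a parameter with fuzzy rules driven by resource utilization, the following is a minimal sketch and not ADAPTER's actual rule base. The membership functions, the three rules, the step size, and the names `fuzzify` and `adjust` are all assumptions made for illustration; the tuned value stands for a hypothetical numeric framework parameter.

```python
def tri(x, a, b, c):
    """Triangular membership: 0 outside (a, c), rising to 1 at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def fuzzify(util):
    """Map a utilization percentage to low/medium/high membership degrees.

    The breakpoints (0, 50, 100) are illustrative assumptions.
    """
    util = min(100.0, max(0.0, util))  # clamp to the valid percentage range
    return {
        "low": tri(util, -1, 0, 50),
        "medium": tri(util, 0, 50, 100),
        "high": tri(util, 50, 100, 101),
    }


# Hypothetical rule base: the direction in which to move the parameter
# for each fuzzy utilization level (+1 = increase, -1 = decrease).
RULES = {"low": +1.0, "medium": 0.0, "high": -1.0}


def adjust(value, cpu_util, step, lo, hi):
    """Return the parameter value after one fuzzy adaptation step.

    The change is the membership-weighted average of the rule outputs,
    scaled by `step` and clamped to the allowed range [lo, hi].
    """
    degrees = fuzzify(cpu_util)
    total = sum(degrees.values())
    change = sum(degrees[k] * RULES[k] for k in RULES) / total
    return min(hi, max(lo, value + step * change))


if __name__ == "__main__":
    # Underutilized CPU nudges the hypothetical parameter upward.
    print(adjust(8, cpu_util=25, step=4, lo=1, hi=16))
```

In this sketch a single utilization reading drives the adjustment; a rule base that combines several resources (for example CPU and disk utilization together) would need one membership mapping per resource and rules over their combinations.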

Full Text