Memory access pattern based insider threat detection in big data systems

Santosh Aditham, Nagarajan Ranganathan, Srinivas Katkoori

doi:10.1109/bigdata.2016.7841027

Abstract

Big data platforms like Hadoop and Spark are being widely adopted by both academia and industry. In this paper, we propose a runtime intrusion detection technique that takes into account the memory properties of such distributed compute platforms. The proposed method is based on runtime analysis of the memory access patterns of tasks running on the slave nodes of a distributed compute cluster. First, every slave node of the cluster creates a behavior profile for each task it executes. A behavior profile records the sizes of the private and shared memory accesses made by a task during execution. Then, each behavior profile is shared with the other replica nodes that are scheduled to execute the same task on their copy of the same data. Next, these replica nodes verify their local tasks against the information in the received behavior profiles. This step is realized by running Principal Component Analysis (PCA) on the memory access patterns. Finally, the nodes share their observations to reach consensus and report a possible intrusion to the master node if they find any discrepancy. This is a position paper; the proposed solution was nonetheless tested and shown to work in real time while executing the TeraSort MapReduce example on a small Hadoop cluster.
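The profile comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a behavior profile is a list of (private, shared) access sizes sampled over time, uses the dominant principal component of each profile as its signature, and treats two profiles as matching when the component directions and variances are close. The function names, the sample data, and the thresholds `min_cosine` and `max_var_ratio` are all hypothetical.

```python
import math

def principal_component(samples):
    """Dominant principal component of 2-D samples (private, shared).

    Returns (variance along the component, unit direction vector).
    Uses the closed-form eigen-decomposition of the 2x2 covariance matrix.
    """
    n = len(samples)
    mean_p = sum(p for p, _ in samples) / n
    mean_s = sum(s for _, s in samples) / n
    # Sample covariance matrix [[a, b], [b, d]]
    a = sum((p - mean_p) ** 2 for p, _ in samples) / (n - 1)
    d = sum((s - mean_s) ** 2 for _, s in samples) / (n - 1)
    b = sum((p - mean_p) * (s - mean_s) for p, s in samples) / (n - 1)
    # Largest eigenvalue of a symmetric 2x2 matrix
    lam = (a + d) / 2 + math.sqrt(((a - d) / 2) ** 2 + b ** 2)
    if abs(b) > 1e-12:
        vx, vy = b, lam - a          # solves (A - lam*I) v = 0
    elif a >= d:
        vx, vy = 1.0, 0.0            # uncorrelated: private axis dominates
    else:
        vx, vy = 0.0, 1.0            # uncorrelated: shared axis dominates
    norm = math.hypot(vx, vy)
    return lam, (vx / norm, vy / norm)

def profiles_match(local, received, min_cosine=0.95, max_var_ratio=2.0):
    """Assumed matching rule: similar direction and similar variance."""
    lam_l, v_l = principal_component(local)
    lam_r, v_r = principal_component(received)
    # Eigenvectors are sign-ambiguous, so compare the absolute cosine
    cosine = abs(v_l[0] * v_r[0] + v_l[1] * v_r[1])
    hi, lo = max(lam_l, lam_r), min(lam_l, lam_r)
    if hi == 0:
        ratio = 1.0
    else:
        ratio = hi / lo if lo > 0 else math.inf
    return cosine >= min_cosine and ratio <= max_var_ratio

def report_intrusion(local, received_profiles):
    """Flag a possible intrusion if any replica profile disagrees."""
    return not all(profiles_match(local, r) for r in received_profiles)

# Hypothetical profiles: (private bytes, shared bytes) per sample
normal = [(100, 10), (200, 22), (300, 29), (400, 41)]
replica = [(110, 11), (190, 20), (310, 31), (390, 40)]
tampered = [(100, 400), (110, 900), (90, 1500), (105, 2100)]

if __name__ == "__main__":
    print("replica agrees:", profiles_match(normal, replica))
    print("tampered node flagged:", report_intrusion(tampered, [normal, replica]))
```

In this sketch the tampered profile is flagged because its variance lies almost entirely along the shared-memory axis, whereas the two replica profiles vary mainly along the private-memory axis. A real deployment would have to calibrate the thresholds against the natural variation between replicas.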

Full Text