Abstract

Big data platforms such as Hadoop and Spark are being widely adopted by both academia and industry. In this paper, we propose a runtime intrusion detection technique that accounts for the memory properties of such distributed compute platforms. The proposed method is based on runtime analysis of the memory access patterns of tasks running on the slave nodes of a distributed compute cluster. First, every slave node of the cluster creates a behavior profile for each task it executes. A behavior profile records the sizes of the private and shared memory accesses made by a task during execution. Then, each behavior profile is shared with the other replica nodes that are scheduled to execute the same task on their copy of the same data. Next, these replica nodes verify their local tasks against the information in the received behavior profiles. This step is realized by running Principal Component Analysis (PCA) on the memory access patterns. Finally, the nodes share their observations to reach consensus and report a possible intrusion to the master node if they find any discrepancy. This is a position paper; the proposed solution was tested and shown to work in real time while executing the TeraSort MapReduce example on a small Hadoop cluster.
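The verification and reporting steps can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a behavior profile is a list of (private_bytes, shared_bytes) samples, compares only the first principal component of the local and received profiles, and uses an arbitrary similarity threshold of 0.95. The names `principal_axis`, `profiles_match`, and `intrusion_suspected` are hypothetical. Because the assumed profile has only two features, the leading principal direction can be computed in closed form from the 2x2 covariance matrix.

```python
import math


def principal_axis(profile):
    """Angle (radians) of the first principal component of a 2-D profile.

    `profile` is a list of (private_bytes, shared_bytes) samples; this
    two-feature layout is an assumption made for the sketch.
    """
    n = len(profile)
    mean_p = sum(p for p, _ in profile) / n
    mean_s = sum(s for _, s in profile) / n
    var_p = sum((p - mean_p) ** 2 for p, _ in profile) / n
    var_s = sum((s - mean_s) ** 2 for _, s in profile) / n
    cov = sum((p - mean_p) * (s - mean_s) for p, s in profile) / n
    # Closed-form direction of the leading eigenvector of the
    # 2x2 covariance matrix [[var_p, cov], [cov, var_s]].
    return 0.5 * math.atan2(2 * cov, var_p - var_s)


def profiles_match(local, received, threshold=0.95):
    """True if the leading principal directions are nearly parallel.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    delta = principal_axis(local) - principal_axis(received)
    # |cos| ignores the sign ambiguity of a principal direction.
    return abs(math.cos(delta)) >= threshold


def intrusion_suspected(comparisons):
    """Report a possible intrusion if any replica comparison disagrees."""
    return not all(comparisons)


# Synthetic profiles: two replicas with similar access patterns and one
# whose private/shared ratio differs (made-up data, not measurements).
replica_a = [(2 * t, t) for t in range(1, 50)]
replica_b = [(2 * t + (t % 3), t + 10) for t in range(1, 50)]
tampered = [(t, 3 * t) for t in range(1, 50)]

comparisons = [
    profiles_match(replica_a, replica_b),
    profiles_match(replica_a, tampered),
]
print(comparisons)                       # [True, False]
print(intrusion_suspected(comparisons))  # True
```

In this sketch a constant offset in access sizes (as in `replica_b`) does not trigger a report, because PCA is computed on mean-centered data, whereas a changed ratio of private to shared accesses does.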
