Content sensitivity based access control framework for Hadoop

T.K. Ashwin Kumar, Hong Liu, Johnson P. Thomas, Xiaofei Hou

doi:10.1016/j.dcan.2017.07.007

Open Access

https://doi.org/10.1016/j.dcan.2017.07.007

Journal: Digital Communications and Networks
Publication Date: Aug 2, 2017
Citations: 10
License type: cc-by-nc-nd

Affiliation: Oklahoma State University

Abstract

Big data technologies have seen tremendous growth in recent years and are now widely used in both industry and academia. Despite this growth, these technologies lack adequate measures to protect data from misuse and abuse. Corporations that collect data from multiple sources risk liability if sensitive information is exposed. In the current implementation of Hadoop, only file-level access control is feasible. Granting access based on the attributes in a dataset or on the user’s role is complicated by the sheer volume of data and its multiple formats (structured, unstructured and semi-structured). In this paper, we propose an access control framework that enforces access control policies dynamically, based on the sensitivity of the data, by harnessing the data context, usage patterns and information sensitivity. Information sensitivity changes over time as datasets are added and removed, which can change access control decisions; the proposed framework accommodates these changes. The framework is largely automated, as the data itself determines the sensitivity with minimal user intervention. Our experimental results show that the proposed framework can enforce access control policies on non-multimedia datasets with minimal overhead.
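To make the idea of sensitivity-driven decisions concrete, the following is a minimal sketch, not the framework described in the paper. The abstract does not say how sensitivity is measured, so the attribute weights, the scoring rule (sum of weights of distinct attributes, capped at 10), and the names `Catalog`, `ATTRIBUTE_WEIGHTS` and `is_allowed` are all assumptions made for illustration. The sketch only shows how adding or removing a dataset can change an access decision for the same user.

```python
from dataclasses import dataclass, field

# Hypothetical sensitivity weights on a 0-10 scale (illustrative only;
# the abstract does not define how sensitivity is computed).
ATTRIBUTE_WEIGHTS = {"name": 4, "zip_code": 3, "diagnosis": 6, "page_views": 1}
MAX_SENSITIVITY = 10


@dataclass
class Catalog:
    """Tracks which attributes each stored dataset exposes."""
    datasets: dict = field(default_factory=dict)

    def add(self, name, attributes):
        self.datasets[name] = set(attributes)

    def remove(self, name):
        self.datasets.pop(name, None)

    def sensitivity(self):
        # Toy rule: sum the weights of all distinct attributes currently
        # present across datasets, capped at MAX_SENSITIVITY.
        attributes = set().union(*self.datasets.values())
        total = sum(ATTRIBUTE_WEIGHTS.get(a, 0) for a in attributes)
        return min(MAX_SENSITIVITY, total)


def is_allowed(clearance, catalog):
    """Grant access only if the user's clearance covers the current sensitivity."""
    return clearance >= catalog.sensitivity()


if __name__ == "__main__":
    catalog = Catalog()
    catalog.add("web_logs", ["page_views", "zip_code"])
    print(is_allowed(5, catalog))   # decision with web logs only

    catalog.add("patients", ["name", "zip_code", "diagnosis"])
    print(is_allowed(5, catalog))   # decision after a sensitive dataset arrives

    catalog.remove("patients")
    print(is_allowed(5, catalog))   # decision after it is removed again
```

The decision is recomputed from the current catalog on every request, so no policy has to be edited by hand when the stored data changes; a real system would also need the data context and usage patterns that the abstract mentions.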

Full Text