Abstract
In recent years, a number of platforms for building Big Data applications, both open-source and proprietary, have been proposed. One of the most popular is Apache Hadoop, an open-source software framework for Big Data processing used by leading companies such as Yahoo and Facebook. Early versions of Hadoop did not prioritize security, so security features have continued to be added in later versions. In particular, the Hadoop Distributed File System (HDFS), upon which Hadoop modules are built, did not provide robust security for user authentication. This paper proposes a token-based authentication scheme that protects sensitive data stored in HDFS against replay and impersonation attacks. The proposed scheme allows HDFS clients to be authenticated by the datanode via the block access token. Unlike most HDFS authentication protocols, which adopt public key exchange approaches, the proposed scheme uses a hash chain of keys. The proposed scheme performs as well as existing HDFS systems in terms of communication power, computing power, and area efficiency.
Highlights
With the growth of social networks and smart devices, the use of Big Data has increased dramatically over the past few years
Open-source platforms for scalable and distributed processing of data are being actively studied in Cloud Computing, in which dynamically scalable and often virtualized IT resources are provided as a service over the Internet [4]
This paper proposes a token-based authentication scheme that protects sensitive Hadoop Distributed File System (HDFS) data against replay and impersonation attacks
Summary
With the growth of social networks and smart devices, the use of Big Data has increased dramatically over the past few years. Delegation token approaches use symmetric encryption, and the shared keys may be distributed to hundreds or even thousands of hosts, depending on the token type [15,16]. This leaves Hadoop communication vulnerable to eavesdropping and modification, making replay and impersonation attacks more likely. Hadoop security controls require the namenode and the datanode to share a secret key in order to use the block access token. This paper proposes a token-based authentication scheme that protects sensitive HDFS data against replay and impersonation attacks. The proposed scheme allows clients to be authenticated to the datanode via the block access token.
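The abstract names a hash chain of keys as the core mechanism but does not spell out the protocol here, so the following is only a minimal sketch of the general hash-chain idea, not the paper's actual scheme. It assumes SHA-256 as the hash function, and the names `build_hash_chain` and `ChainVerifier` are hypothetical. The verifier holds only the last element of the chain; each key is revealed in reverse order and is accepted once, so a replayed key is rejected.

```python
import hashlib
import hmac


def build_hash_chain(seed: bytes, length: int) -> list[bytes]:
    """Return [H(seed), H(H(seed)), ...] with `length` elements.

    SHA-256 is an assumption made for this sketch.
    """
    chain = []
    value = seed
    for _ in range(length):
        value = hashlib.sha256(value).digest()
        chain.append(value)
    return chain


class ChainVerifier:
    """Hypothetical verifier that stores only the most recently accepted key."""

    def __init__(self, anchor: bytes):
        # The anchor is the final chain element, shared in advance.
        self.last_accepted = anchor

    def accept(self, candidate: bytes) -> bool:
        # A valid candidate hashes to the previously accepted key.
        digest = hashlib.sha256(candidate).digest()
        if hmac.compare_digest(digest, self.last_accepted):
            self.last_accepted = candidate
            return True
        return False


# Usage: keys are revealed in reverse order of generation.
chain = build_hash_chain(b"demo-seed", 4)
verifier = ChainVerifier(chain[-1])
first_ok = verifier.accept(chain[-2])   # fresh key: accepted
replay_ok = verifier.accept(chain[-2])  # same key again: rejected
```

Because the hash is one-way, an observer who sees a revealed key cannot derive the next key to be revealed, and reusing an already accepted key fails the check. How the paper binds such keys to the block access token is not stated in this section.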