Abstract

This paper examines the combination of trusted computing technologies with the Apache Hadoop Distributed File System (HDFS) to address concerns of data confidentiality and integrity. We present the motivation for this work and address a set of common security concerns within HDFS through infrastructure and software that provide data-at-rest encryption and integrity validation. To accomplish these goals, we use technology from the Trusted Computing Group (TCG), such as the widely available Trusted Platform Module (TPM). In addition, we discuss our design considerations in building a trustworthy encryption framework for Hadoop, and report the results of our experiments with an encryption scheme for Hadoop that uses hardware key protection and AES-NI for encryption acceleration. As part of this design, we examine the recently implemented crypto framework for Hadoop and independently test the claim that AES-NI mitigates the performance overhead of encryption.
