Abstract

As big data processing in the cloud becomes prevalent today, data privacy on such public platforms raises critical concerns. Hardware-based trusted execution environments (TEEs) provide promising and practical platforms for low-cost privacy-preserving data processing. However, using TEEs to enhance the security of data analytics frameworks like Apache Spark involves challenging issues when separating various framework components into trusted and untrusted domains, demanding meticulous considerations for programmability, performance, and security. Based on Intel SGX, we build Flare, a fast, secure, and memory-efficient data analytics framework with a familiar user programming interface and useful functionalities similar to Apache Spark. Flare ensures confidentiality and integrity by keeping sensitive data and computations encrypted and authenticated. It also supports oblivious processing to protect against access pattern side channels. The main innovations of Flare include a novel abstraction paradigm of shadow operators and shadow tasks to minimize trusted components and reduce domain switch overheads, memory-efficient data processing with proper granularities for different operators, and adaptive parallelization based on memory allocation intensity for better scalability. Flare outperforms the state-of-the-art secure framework by 3.0× to 176.1×, and is also 2.8× to 28.3× faster than a monolithic libOS-based integration approach.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call