Abstract

A privacy-preserving data analytics system lets a cloud user run distributed jobs securely, so that data privacy is guaranteed during cloud-outsourced computation. However, many SGX-based solutions are vulnerable to side-channel attacks, including access pattern leakage from both the network and memory. Several fully oblivious algorithms have been proposed in the literature, but their computational overhead makes them impractical in the cloud. In this article, we propose DPSpark, a system whose security is defined by the notion of $(\epsilon,\delta)$-differentially private obliviousness ($(\epsilon,\delta)$-DPO), which relaxes full obliviousness in exchange for improved efficiency. Based on this definition, we present a perturbation-shuffle-analysis (PSA) computing architecture and design several typical differentially oblivious operators. Furthermore, we optimize system efficiency by reducing the number of oblivious shuffles and choosing an appropriate privacy budget. Finally, we benchmark the system under different parameter settings. The experimental results show that DPSpark significantly outperforms two state-of-the-art solutions, incurring only 10.1-85.4 percent additional overhead when running an SGX-based data analysis application.
