Cheetah: Optimizing and Accelerating Homomorphic Encryption for Private Inference

Brandon Reagen, Gu-Yeon Wei, Hsien-Hsin S. Lee, Yeongil Ko, Woo-Seok Choi, Vincent T. Lee, David Brooks

doi:10.1109/hpca51647.2021.00013

Open Access

Abstract

As the application of deep learning continues to grow, so does the amount of data used to make predictions. While traditionally big-data deep learning was constrained by computing performance and off-chip memory bandwidth, a new constraint has emerged: privacy. One solution is homomorphic encryption (HE). Applying HE to the client-cloud model allows cloud services to perform inferences directly on clients’ encrypted data. While HE can meet privacy constraints, it introduces enormous computational challenges and remains impractically slow on current systems. This paper introduces Cheetah, a set of algorithmic and hardware optimizations for server-side HE DNN inference. Cheetah proposes HE-parameter tuning and operator scheduling optimizations, which together deliver up to $79\times$ speedup over the state-of-the-art. However, HE inference still falls short of real-time inference speeds by nearly four orders of magnitude. Cheetah further proposes an accelerator architecture to understand the degree of speedup hardware can provide and whether it can bridge HE’s real-time performance gap. We evaluate several DNNs and find that privacy-preserving HE inference for ResNet50 can approach real-time speeds with a 587 mm$^2$ accelerator dissipating 30 W in 5 nm.