Abstract

Transformer-based Large Models (TLMs), such as generative pre-trained transformers (GPT), have become increasingly popular in practical applications delivered through Deep Learning as a Service (DLaaS), and they are extensively used in natural language processing and computer vision. However, this type of inference service raises concerns about the leakage of private data. Existing private inference techniques can protect privacy, but they often introduce high latency and rely on approximate replacements in their protocol designs, which alters the model structure and decreases accuracy. In this work, we present SecureTLM, a private inference method based on secure multi-party computation (MPC) that requires no modification to the underlying model structure. SecureTLM provides protocols for the crucial computations in TLMs, including Multiplication, Softmax, GeLU, and LayerNorm. Experimental results demonstrate that SecureTLM preserves data privacy, maintains correctness, and achieves efficient private inference.
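To make the MPC setting concrete, the sketch below illustrates the classic building block behind private multiplication: additive secret sharing combined with a preprocessed Beaver triple, in which two parties compute shares of a product without either seeing the other's input. This is a minimal, self-contained illustration of the general primitive the abstract refers to; it is not SecureTLM's actual protocol, and the modulus, function names, and two-party layout are assumptions for exposition.

```python
# Illustrative sketch only: additive secret sharing with a Beaver triple,
# the standard MPC building block for private multiplication. This is NOT
# SecureTLM's protocol; names and parameters here are hypothetical.
import random

P = 2**61 - 1  # assumed prime modulus; shares live in the field Z_P

def share(x):
    """Split x into two additive shares: x = x0 + x1 (mod P)."""
    x0 = random.randrange(P)
    return x0, (x - x0) % P

def reconstruct(s0, s1):
    """Recombine two additive shares into the underlying value."""
    return (s0 + s1) % P

def beaver_multiply(x_shares, y_shares, triple):
    """Compute shares of x*y from shares of x, y and a triple (a, b, c=a*b)."""
    (a0, a1), (b0, b1), (c0, c1) = triple
    # Each party masks its input share; only the masked values are opened.
    e = reconstruct(x_shares[0] - a0, x_shares[1] - a1)  # e = x - a (public)
    f = reconstruct(y_shares[0] - b0, y_shares[1] - b1)  # f = y - b (public)
    # x*y = c + e*b + f*a + e*f; the public e*f term is added by one party only.
    z0 = (c0 + e * b0 + f * a0 + e * f) % P
    z1 = (c1 + e * b1 + f * a1) % P
    return z0, z1

# Usage: privately compute 6 * 7; neither party ever holds both plaintexts.
a, b = random.randrange(P), random.randrange(P)
triple = (share(a), share(b), share(a * b % P))
z0, z1 = beaver_multiply(share(6), share(7), triple)
assert reconstruct(z0, z1) == 42
```

Protocols like those named in the abstract (Softmax, GeLU, LayerNorm) are built over primitives of this kind; the design challenge the paper addresses is supporting such nonlinear operations without the approximate substitutions that would alter the model structure.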
