Abstract

Federated learning facilitates the collaborative training of a global model among distributed clients without sharing their training data. Secure aggregation, a new security primitive for federated learning, aims to preserve the confidentiality of both local models and training data. Unfortunately, existing secure aggregation solutions fail to defend against Byzantine failures that are common in distributed computing systems. In this work, we propose a new secure and efficient aggregation framework, SEAR, for Byzantine-robust federated learning. Relying on a trusted execution environment, i.e., Intel SGX, SEAR protects clients’ private models while enabling Byzantine resilience. Considering the limitation of the current Intel SGX architecture (i.e., its limited trusted memory), we propose two data storage modes to implement aggregation algorithms efficiently in SGX. Moreover, to balance the efficiency and performance of aggregation, we propose a sampling-based method to efficiently detect Byzantine failures without degrading the global model's performance. We implement and evaluate SEAR in a LAN environment, and the experiment results show that SEAR is computationally efficient and robust to Byzantine adversaries. Compared to the previous practical secure aggregation framework, SEAR improves aggregation efficiency by 4-6 times while also supporting Byzantine resilience.
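
The abstract does not spell out the sampling-based detection, so the following Python sketch only illustrates the general idea: score each client update on a small random sample of its coordinates against a robust statistic, and average only the updates that pass. The function name, the median/MAD scoring, and the threshold are assumptions made for illustration, not SEAR's actual algorithm.

```python
import numpy as np

def sampled_byzantine_filter(updates, sample_frac=0.01, z_thresh=3.0, rng=None):
    """Illustrative sketch only (not SEAR's algorithm): score each client update
    on a random sample of coordinates and average the updates that look benign."""
    rng = rng if rng is not None else np.random.default_rng()
    updates = np.stack(updates)                        # shape: (num_clients, model_dim)
    dim = updates.shape[1]
    k = max(1, int(sample_frac * dim))                 # inspect only a small sample of coordinates
    idx = rng.choice(dim, size=k, replace=False)
    sampled = updates[:, idx]
    med = np.median(sampled, axis=0)                   # coordinate-wise median over clients
    mad = np.median(np.abs(sampled - med), axis=0) + 1e-12
    scores = np.mean(np.abs(sampled - med) / mad, axis=1)
    accepted = scores < z_thresh                       # hypothetical acceptance rule
    if not accepted.any():                             # fall back to keeping everyone
        accepted[:] = True
    return updates[accepted].mean(axis=0), accepted
```

In an actual SEAR deployment such a check would run on the decrypted updates inside the SGX enclave; it is shown here as ordinary NumPy code purely to convey the sampling idea.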

Highlights

  • We propose SEAR, a new secure aggregation framework for Byzantine-robust federated learning, which uses the trusted hardware Intel Software Guard Extensions (SGX) to provide privacy guarantees and aggregation efficiency at the same time

  • Considering SGX's limited trusted memory, we propose two data storage modes that enable the aggregation of a large number of models inside the trusted execution environment

Summary

Introduction

Recently, federated learning has been introduced to train a shared global model with massively distributed clients [1]. The learning process consists of multiple training rounds. In each round, an aggregation server distributes the current global model to a set of randomly selected clients. The clients train the machine learning model locally and return the updated models to the server. To protect the clients’ private information, the server by design has no visibility into the clients’ training processes and local data; only the trained models are revealed to the server. However, recent works have shown that gradient information can be used to infer private content about the clients’ training data: the server can recover the distribution of the training data through a generative adversarial network (GAN) [2] or even reconstruct pixel-wise accurate images.
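
For reference, the training round described above can be written as a short federated-averaging sketch in Python. The names federated_round and local_train are illustrative assumptions, and this sketch deliberately omits the secure aggregation, SGX protection, and Byzantine detection that SEAR adds on top of the basic protocol.

```python
import random
import numpy as np

def federated_round(global_model, clients, clients_per_round, local_train):
    """One plain (non-secure) federated learning round: the server sends the
    current global model to a random subset of clients, each client trains
    locally on its private data, and the server averages the returned models."""
    selected = random.sample(clients, clients_per_round)
    local_models = []
    for client in selected:
        # local_train is assumed to run a few local training epochs on the
        # client's private data and return the updated parameters as a 1-D array.
        local_models.append(local_train(client, np.copy(global_model)))
    # The server never sees the raw data, only the trained local models,
    # which is exactly the information leakage the paper is concerned with.
    return np.mean(local_models, axis=0)
```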
