Abstract

In a vertical federated learning (VFL) scenario where features and models are split across different parties, it has been shown that sample-level gradient information can be exploited to deduce crucial label information that should be kept secret. An immediate defense strategy is to protect sample-level messages communicated with Homomorphic Encryption (HE), exposing only batch-averaged local gradients to each party. In this paper, we show that even with HE-protected communication, private labels can still be reconstructed with high accuracy by a gradient inversion attack, contrary to the common belief that batch-averaged information is safe to share under encryption. We then show that a backdoor attack can also be conducted by directly replacing encrypted communicated messages without decryption. To tackle these attacks, we propose a novel defense method, the Confusional AutoEncoder (termed CAE), which uses an autoencoder with entropy regularization to disguise the true labels. To further defend against attackers with sufficient prior label knowledge, we introduce the DiscreteSGD-enhanced CAE (termed DCAE), and show that DCAE achieves significantly higher main task accuracy than other known methods when defending against various label inference attacks.
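
To make the CAE idea above concrete, below is a minimal PyTorch sketch, written for illustration rather than taken from the paper's implementation: an encoder maps one-hot labels to soft "confused" labels, a decoder learns to recover the true labels locally, and an entropy bonus pushes the encoded labels toward high entropy so that gradients computed on them leak little label information. The class name `ConfusionalAutoEncoder`, the hidden width, the weight `lam`, and the toy training loop are all assumptions made for this sketch.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConfusionalAutoEncoder(nn.Module):
    """Illustrative label-disguising autoencoder (not the authors' code).

    The encoder turns one-hot labels into soft "confused" labels that can
    be used in place of the true labels; the decoder inverts the mapping
    so the label-holding party can still recover the true labels locally.
    """

    def __init__(self, num_classes: int, hidden: int = 32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(num_classes, hidden), nn.ReLU(),
            nn.Linear(hidden, num_classes),
        )
        self.decoder = nn.Sequential(
            nn.Linear(num_classes, hidden), nn.ReLU(),
            nn.Linear(hidden, num_classes),
        )

    def forward(self, one_hot: torch.Tensor):
        soft = F.softmax(self.encoder(one_hot), dim=-1)  # disguised labels
        recon_logits = self.decoder(soft)                # reconstruction
        return soft, recon_logits

def cae_loss(one_hot, soft, recon_logits, lam: float = 1.0):
    # Reconstruction term: the decoder must recover the true label.
    recon = F.cross_entropy(recon_logits, one_hot.argmax(dim=-1))
    # Entropy regularization: reward high-entropy (confusing) soft labels
    # by subtracting the entropy from the loss.
    entropy = -(soft * soft.clamp_min(1e-12).log()).sum(dim=-1).mean()
    return recon - lam * entropy

# Toy usage on random one-hot labels (illustrative only).
cae = ConfusionalAutoEncoder(num_classes=10)
opt = torch.optim.Adam(cae.parameters(), lr=1e-3)
for _ in range(200):
    y = F.one_hot(torch.randint(0, 10, (64,)), 10).float()
    soft, recon_logits = cae(y)
    loss = cae_loss(y, soft, recon_logits)
    opt.zero_grad(); loss.backward(); opt.step()
```

In this sketch the trade-off is explicit: the reconstruction term keeps the disguised labels invertible for the legitimate label holder, while the entropy term makes each disguised label look close to uniform to an attacker observing gradients.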
