Abstract

Recently, membership inference attacks (MIAs) against machine learning models have been proposed. Using MIAs, an adversary can infer whether a given data record was in the training set of the target model. Defenses based on differential privacy mechanisms or adversarial training handle the trade-off between privacy and utility poorly, while methods based on knowledge transfer require public unlabeled data drawn from the same distribution as the private data, a requirement that may not be satisfied in some scenarios. To handle the trade-off between privacy and utility better, we propose two deep learning algorithms: complementary knowledge distillation (CKD) and pseudo complementary knowledge distillation (PCKD). In CKD, all transfer data for knowledge distillation come from the private training set, but their soft targets are generated by a teacher model trained on the complementary set. Building on the same idea, PCKD reduces the training set of each teacher model and uses model averaging to generate the soft targets of the transfer data. Because a smaller training set leads to lower utility, PCKD uses pre-training to improve the utility of the teacher models. Experimental results on widely used datasets show that both CKD and PCKD decrease attack accuracy by nearly 25% on average with negligible utility loss, and the training time of PCKD is nearly 40% lower than that of CKD. Compared with existing defense methods such as DMP, adversarial regularization, dropout, and DP-SGD, CKD and PCKD have great advantages in handling the trade-off between privacy and utility.
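The core CKD idea in the abstract — soft targets for each private record come from a teacher trained on that record's complementary set, so no teacher ever labels its own training data — can be sketched as follows. This is a minimal toy illustration, not the paper's method: the model (linear softmax regression), the two-way split, the synthetic data, and all hyperparameters are assumptions for the sketch.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def train_softmax(X, y, n_classes, lr=0.5, epochs=200):
    """Train a linear softmax classifier (toy stand-in for a teacher)."""
    W = np.zeros((X.shape[1], n_classes))
    Y = np.eye(n_classes)[y]
    for _ in range(epochs):
        P = softmax(X @ W)
        W -= lr * X.T @ (P - Y) / len(X)
    return W

def distill(X, soft_targets, lr=0.5, epochs=200):
    """Train a student against teacher probabilities (soft targets)."""
    W = np.zeros((X.shape[1], soft_targets.shape[1]))
    for _ in range(epochs):
        P = softmax(X @ W)
        W -= lr * X.T @ (P - soft_targets) / len(X)
    return W

rng = np.random.default_rng(0)
# Toy "private" dataset: two Gaussian blobs, shuffled.
X = np.vstack([rng.normal(-1, 1, (100, 2)), rng.normal(1, 1, (100, 2))])
y = np.array([0] * 100 + [1] * 100)
perm = rng.permutation(200)
X, y = X[perm], y[perm]

# Split the private set into two halves; the soft targets of each half
# come from a teacher trained on the *other* half (its complementary set).
half = len(X) // 2
soft = np.empty((len(X), 2))
for part, comp in [(slice(0, half), slice(half, None)),
                   (slice(half, None), slice(0, half))]:
    W_teacher = train_softmax(X[comp], y[comp], n_classes=2)
    soft[part] = softmax(X[part] @ W_teacher)

# The student sees all private records, but only complement-derived targets.
W_student = distill(X, soft)
pred = softmax(X @ W_student).argmax(axis=1)
print(f"student accuracy: {(pred == y).mean():.2f}")
```

Because no teacher's predictions are ever queried on its own training points, the soft targets carry less membership signal than hard labels or self-predictions would, which is the intuition behind the attack-accuracy reduction reported above.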
