VFLR: An Efficient and Privacy-Preserving Vertical Federated Framework for Logistic Regression

Jiaqi Zhao, Linfeng Li, Hui Zhu, Rongxing Lu, Fengwei Wang, Hui Li, Ermei Wang

doi:10.1109/tcc.2023.3247870

Abstract

With the explosive growth of data volume and computing capability, federated learning, which constructs global models over multiple data islands, has demonstrated its advantages and broad prospects in machine learning. However, because data are commonly vertically partitioned across participants, and because of privacy concerns about data leakage, traditional federated learning still faces challenging issues. To tackle these challenges, we propose VFLR, an efficient and privacy-preserving vertical federated learning framework for logistic regression, in which multiple participants collaboratively perform global model training and query over their vertically partitioned data. Specifically, we first design a data aggregation matrix construction algorithm, with which the vertically partitioned data can be aggregated for high-accuracy global model training. Then, by utilizing a novel symmetric homomorphic encryption scheme, our framework ensures that the training and query processes do not leak any private information. Moreover, based on the data aggregation matrix, VFLR does not require multi-round interactions, which significantly improves training efficiency. Detailed security analysis shows that VFLR protects data and model information from inference attacks. In addition, extensive experiments demonstrate that VFLR achieves high training and query accuracy with low computation and communication overhead.
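To make the vertically partitioned setting concrete, the following minimal sketch shows that a logistic regression prediction can be assembled from per-participant partial dot products over disjoint feature columns. It illustrates the problem setting only: it is not the VFLR data aggregation matrix construction and it uses no encryption, so the partial values exchanged here are plaintext, which a privacy-preserving framework would have to protect. The helper names (`split_columns`, `partial_logit`) and the two-participant column split are hypothetical choices made for illustration.

```python
import math
import random


def sigmoid(z):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def partial_logit(features, weights):
    """Dot product of one participant's feature columns with the matching weights."""
    return sum(x * w for x, w in zip(features, weights))


def split_columns(row, widths):
    """Split one row into consecutive column blocks, one block per participant."""
    blocks, start = [], 0
    for width in widths:
        blocks.append(row[start:start + width])
        start += width
    return blocks


random.seed(0)
widths = [2, 3]  # assumption: participant A holds 2 features, participant B holds 3
row = [random.uniform(-1, 1) for _ in range(sum(widths))]
weights = [random.uniform(-1, 1) for _ in range(sum(widths))]

x_blocks = split_columns(row, widths)
w_blocks = split_columns(weights, widths)

# Each participant computes a partial logit from its own columns only.
partials = [partial_logit(x, w) for x, w in zip(x_blocks, w_blocks)]

# Summing the partial logits reproduces the prediction on the full feature row.
centralized = sigmoid(partial_logit(row, weights))
federated = sigmoid(sum(partials))
```

Because the logit is linear in the features, the two predictions agree up to floating-point rounding. The hard part that a vertical federated framework must address is combining these contributions without revealing any participant's features or model weights.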

Full Text