Abstract

Federated learning is a privacy-preserving machine learning technique that coordinates model training across multiple participants. It can alleviate the privacy issues of software defect prediction, an important technique for ensuring software quality. In this work, we implement Federated Software Defect Prediction (FedSDP) and address its privacy issues while guaranteeing model performance. We first construct a new benchmark to study the performance and privacy of federated software defect prediction. The benchmark consists of (1) 12 NASA software defect datasets, all drawn from real projects in different domains, (2) horizontal federated learning scenarios, and (3) the Federated Software Defect Prediction algorithm (FedSDP). Benchmark analysis shows that, compared with local training, FedSDP provides additional privacy protection and security while maintaining model performance. It also reveals that FedSDP introduces a large amount of model-parameter computation and exchange during training, exposing model users to threats and attacks from unreliable participants. To provide more reliable privacy protection without losing prediction performance, we propose an optimization that applies homomorphic encryption to model parameters to resist honest-but-curious participants. Experimental results show that our approach achieves more reliable privacy protection while performing well on all datasets.
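The defense described above rests on additively homomorphic encryption of model parameters, which lets an honest-but-curious aggregator combine client updates without reading any individual one. The paper's exact scheme and parameterization are not given here; the sketch below is a minimal illustration of the idea using the Paillier cryptosystem via the `phe` library (python-paillier), with toy scalar updates and hypothetical names such as `client_updates` standing in for real model weights.

```python
# Minimal sketch of homomorphically encrypted parameter aggregation,
# assuming the Paillier scheme from the `phe` library (python-paillier).
# Scalar "updates" stand in for full weight vectors for brevity.
from phe import paillier

# Key pair held by the clients (or a trusted key authority); the
# aggregation server sees only the public key, so it can combine
# ciphertexts but never read an individual client's update.
public_key, private_key = paillier.generate_paillier_keypair(n_length=2048)

# Toy local model updates, one scalar per client (illustrative values).
client_updates = [0.12, -0.05, 0.31]

# Each client encrypts its update before sending it to the server.
encrypted_updates = [public_key.encrypt(w) for w in client_updates]

# The honest-but-curious server sums ciphertexts directly: Paillier is
# additively homomorphic, so Enc(a) + Enc(b) decrypts to a + b.
encrypted_sum = sum(encrypted_updates[1:], encrypted_updates[0])

# Only the private-key holder can decrypt the aggregate; individual
# updates remain hidden from the server throughout training.
average_update = private_key.decrypt(encrypted_sum) / len(client_updates)
print(f"FedAvg-style aggregate: {average_update:.4f}")
```

The key design point is that decryption happens only after aggregation, so the server learns the averaged parameters needed for the next federated round but never any single participant's contribution.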

