Abstract

With the arrival of the Internet of Things (IoT) era and the rapid development of new technologies such as artificial intelligence and big data, data is growing geometrically. These data are scattered across many locations and tend to form data silos. Applying decentralized data jointly would accelerate progress and development, but it also faces the challenge of data privacy protection. Federated learning (FL), a new branch of distributed machine learning, obtains a global model through collaborative training without directly exposing local datasets. Studies have shown that federated learning typically involves a large number of participants, which can significantly increase communication overhead and lead to issues such as higher latency and bandwidth consumption. We propose masking a subset of diverse participants and allowing the remaining participants to proceed with the next communication round of updates. Our aim is to reduce communication overhead and improve the convergence performance of the global model while protecting the privacy of heterogeneous data. We design a private masking approach, PrivMaskFL, to address two problems. First, we propose a dynamic participant-aggregation masking approach that adopts a greedy strategy to select the relatively important participants and mask the unimportant ones. Second, we design an adaptive differential privacy approach that stratifies the privacy budget according to participant characteristics, allocates the budget at a fine-grained stratified level, and adds Gaussian noise accordingly. Specifically, in each communication round, each participant's model performs local differential privacy noise addition before uplink parameter transmission; the server aggregates the updates to obtain the global model and, using a greedy approximation, finds a candidate participant subset with smaller parameter divergence for the (t+1)-th communication round of downlink parameter transmission. Subsequently, the privacy budget sequence is divided and granted to the participants at each stratified level, and adaptive differentially private Gaussian noise is added to achieve privacy protection without compromising the model's usability. In experiments, our approach reduces communication overhead and improves convergence performance. Furthermore, it achieves higher accuracy and more robust variance on both the FMNIST and FEMNIST datasets.
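To make the two mechanisms described above concrete, the following is a minimal NumPy sketch of one communication round, under several assumptions not taken from the paper: the helper names `gaussian_mechanism` and `greedy_mask` are hypothetical, the noise scale uses the classical analytic Gaussian bound with a fixed sensitivity (gradient clipping is omitted), the per-level privacy budgets are illustrative values, and "divergence" is taken as the L2 distance from the global parameters. PrivMaskFL's actual budget stratification and greedy selection criterion may differ.

```python
import numpy as np

rng = np.random.default_rng(0)

def gaussian_mechanism(update, epsilon, delta, sensitivity):
    """Add Gaussian noise for (epsilon, delta)-DP.

    Assumes the classical calibration sigma = sqrt(2 ln(1.25/delta)) * S / epsilon;
    in practice, updates would first be clipped so their norm is at most S.
    """
    sigma = np.sqrt(2.0 * np.log(1.25 / delta)) * sensitivity / epsilon
    return update + rng.normal(0.0, sigma, size=update.shape)

def greedy_mask(global_params, client_params, keep):
    """Greedily keep the `keep` clients whose parameters diverge least
    (in L2 norm) from the global model; the rest are masked for round t+1.
    """
    divergence = [np.linalg.norm(p - global_params) for p in client_params]
    order = np.argsort(divergence)       # smallest divergence first
    return set(order[:keep].tolist())    # candidate subset for round t+1

# One simulated round with 4 clients of varying heterogeneity (toy data).
global_params = rng.normal(size=100)
clients = [global_params + rng.normal(scale=s, size=100)
           for s in (0.1, 0.2, 0.5, 1.0)]

# Uplink: local DP noise; epsilon per client follows an assumed
# stratified budget allocation.
budgets = [1.0, 1.0, 0.5, 0.5]
noisy = [gaussian_mechanism(c, eps, delta=1e-5, sensitivity=1.0)
         for c, eps in zip(clients, budgets)]

# Server: aggregate, then greedily choose the active subset for round t+1.
global_params = np.mean(noisy, axis=0)
active = greedy_mask(global_params, noisy, keep=2)
print("active clients for round t+1:", sorted(active))
```

Masked clients simply skip the next round's uplink, which is where the communication savings come from; how aggressively to mask (the `keep` parameter here) trades off overhead against convergence speed.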
