Abstract

Federated Learning (FL) is considered a key enabling approach for privacy-preserving, distributed machine learning (ML) systems. FL requires the periodic transmission of ML models from users to the server, so communication over resource-constrained networks is currently a fundamental bottleneck in FL, restricting both ML model complexity and user participation. A notable trend for reducing the communication cost of FL systems is gradient compression, typically realized through sparsification techniques. However, these methods apply a single compression rate to all users and do not account for the communication heterogeneity of a real-world FL system; as a result, they are bottlenecked by the worst communication capacity across users. Further, such sparsification methods are non-adaptive and do not exploit the redundant, similar information shared across users' ML models. In this paper, we introduce a novel Dynamic Sparsification for Federated Learning (DSFL) approach that enables users to compress their local models according to their communication capacity at each iteration by using two novel sparsification methods: layer-wise similarity sparsification (LSS) and extended top-$K$ sparsification. LSS enables DSFL to exploit the globally redundant information in users' models by using the Centered Kernel Alignment (CKA) similarity measure for sparsification. The extended top-$K$ model sparsification method empowers DSFL to accommodate the heterogeneous communication capacity of user devices by allowing a different sparsification rate $K$ for each user at each iteration. Our extensive experimental results on three datasets (all code and experiments are publicly available at https://github.com/mahdibeit/DSFL) show that DSFL converges faster than fixed sparsification, and this gap widens as communication heterogeneity increases. Further, our thorough experimental investigation uncovers the similarities of user models across the FL system.
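As a rough illustration of the two building blocks named above, the sketch below shows a plain top-$K$ magnitude sparsifier and a linear CKA similarity function written in PyTorch. The function names (top_k_sparsify, linear_cka) and the per-user choice of $K$ are illustrative assumptions, not the paper's implementation; DSFL's actual layer-wise use of CKA to allocate the sparsification budget is described in the full text.

import torch

def top_k_sparsify(tensor, k):
    # Keep the k largest-magnitude entries of a parameter tensor; zero the rest.
    flat = tensor.flatten()
    if k >= flat.numel():
        return tensor.clone()
    _, idx = torch.topk(flat.abs(), k)
    mask = torch.zeros_like(flat)
    mask[idx] = 1.0
    return (flat * mask).view_as(tensor)

def linear_cka(x, y):
    # Linear CKA similarity between two activation matrices of shape (n_samples, n_features).
    x = x - x.mean(dim=0, keepdim=True)
    y = y - y.mean(dim=0, keepdim=True)
    cross = torch.linalg.norm(x.T @ y, 'fro') ** 2
    norm_x = torch.linalg.norm(x.T @ x, 'fro')
    norm_y = torch.linalg.norm(y.T @ y, 'fro')
    return cross / (norm_x * norm_y)

In this sketch, each user would simply pick $K$ from its current link budget, e.g., a well-connected user might keep 10% of its parameters while a constrained user keeps 1%, so no user is forced down to the worst channel in the system.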
