Abstract

Recently, Federated Learning (FL) has drawn tremendous attention due to its ability to protect clients' privacy. In FL, clients collaboratively train machine learning models by sharing only intermediate computations, i.e., gradients of model parameters. However, training a complex model involves multiple rounds of interaction between clients and the server over the Internet. Consequently, communication is a primary bottleneck of FL, owing to poor network conditions and the large volume of exchanged computations. To overcome this bottleneck, we propose the ClusterGrad algorithm, which compresses gradients and thereby considerably reduces the volume of communicated data. Our design exploits the observation that, in each round of interaction in FL, only a small fraction of gradients have values far away from the origin. We first identify these essential gradients, i.e., those far from 0, using the K-means algorithm, and approximate their values with a novel clustering-based quantization scheme. The remaining gradients, which lie close to 0, are approximated with a single value. We prove that ClusterGrad outperforms the latest FL gradient compression algorithms, Probability Quantization (PQ) and Deep Gradient Compression (DGC). Extensive experiments on the CIFAR-10 dataset further demonstrate that ClusterGrad achieves a compression ratio (used interchangeably with compression rate) of 123 on average, compared with 16 for PQ and 60 for DGC.
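To make the idea concrete, below is a minimal sketch of a ClusterGrad-style compressor based only on the description in the abstract: K-means separates the few far-from-zero gradients from the near-zero bulk, the far-from-zero values are quantized to cluster centroids, and the near-zero values are collapsed to a single representative. The function names, the number of quantization levels, the two-cluster magnitude split, and the use of the mean as the near-zero representative are illustrative assumptions, not the authors' exact implementation.

```python
import numpy as np
from sklearn.cluster import KMeans


def compress_gradients(grad, num_levels=4):
    """Sketch of a ClusterGrad-style compressor (names and parameters are
    assumptions; the paper's exact procedure may differ)."""
    flat = grad.ravel().astype(np.float32)

    # Step 1: K-means on gradient magnitudes separates the small fraction of
    # gradients far from 0 from the near-zero bulk.
    mags = np.abs(flat).reshape(-1, 1)
    split = KMeans(n_clusters=2, n_init=10).fit(mags)
    far_cluster = int(np.argmax(split.cluster_centers_.ravel()))
    far_mask = split.labels_ == far_cluster
    far_values = flat[far_mask]

    # Step 2: clustering-based quantization of the far-from-zero gradients:
    # each value is replaced by its cluster centroid and sent as a small code.
    k = min(num_levels, len(far_values))
    quant = KMeans(n_clusters=k, n_init=10).fit(far_values.reshape(-1, 1))
    centroids = quant.cluster_centers_.ravel()      # a few float centroids
    codes = quant.labels_.astype(np.uint8)          # small integer codes

    # Step 3: all remaining near-zero gradients share one representative value
    # (the mean is used here as an assumption).
    near_zero_value = float(flat[~far_mask].mean()) if (~far_mask).any() else 0.0

    # Only this compact payload would need to be transmitted to the server.
    return {
        "shape": grad.shape,
        "far_indices": np.flatnonzero(far_mask),
        "codes": codes,
        "centroids": centroids,
        "near_zero_value": near_zero_value,
    }


def decompress_gradients(payload):
    """Rebuild an approximate gradient tensor from the compact payload."""
    flat = np.full(int(np.prod(payload["shape"])),
                   payload["near_zero_value"], dtype=np.float32)
    flat[payload["far_indices"]] = payload["centroids"][payload["codes"]]
    return flat.reshape(payload["shape"])
```

Under these assumptions, only the indices and small integer codes of the far-from-zero gradients plus a handful of centroids are communicated, which is what drives the reduction in exchanged data.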
