Abstract

Federated learning (FL) enables multiple devices to collaboratively accomplish a machine learning task by iteratively exchanging their model updates, instead of raw data, with a parameter server (PS), thus protecting users' data privacy. Nevertheless, communication efficiency and non-independent and identically distributed (non-IID) data remain intractable challenges in FL. This article proposes a hybrid FL framework called FedSeq, based on user clustering and sequential in-cluster training, to improve communication efficiency and test accuracy, especially on non-IID data. FedSeq first divides users into multiple clusters and selects a cluster head (CH) on behalf of each cluster to upload the model updates to the PS, thus drastically reducing the uplink communication overhead. Within each cluster, a sequential training method is designed that enables the CHs' models to witness more categories of training data and go through more meta-updates per training epoch, thus promoting the test accuracy. Moreover, we provide the convergence analysis of FedSeq on IID data with Random user clustering, with specific experiments validating our theoretical analysis from a simulation perspective. In the experiments, we compare FedSeq with other baselines, including FedSGD, FedAVG, FedProx, FedCluster, and HierFAVG, and the results demonstrate that FedSeq outperforms other FL paradigms in terms of model accuracy and training efficiency. We also test FedSeq with the Random and Low-Energy Adaptive Clustering Hierarchy clustering strategies to demonstrate the robustness of FedSeq to different clustering strategies. FedSeq offers a potential solution to cope with non-IID data and reduce uplink communication overhead in FL.
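
To make the described procedure concrete, below is a minimal sketch of one FedSeq communication round, assuming Random clustering, NumPy parameter vectors, a placeholder `local_train` gradient step, and the last user of each cluster acting as the cluster head; it is an illustration of the idea, not the authors' implementation.

```python
# Hypothetical sketch of one FedSeq round (assumptions noted above, not the paper's code).
import numpy as np

rng = np.random.default_rng(0)

def local_train(params, data, lr=0.1):
    """Placeholder local update: one gradient step on a least-squares loss."""
    X, y = data
    grad = X.T @ (X @ params - y) / len(y)
    return params - lr * grad

def random_clusters(num_users, num_clusters):
    """Random clustering: partition user indices into clusters at random."""
    perm = rng.permutation(num_users)
    return np.array_split(perm, num_clusters)

def fedseq_round(global_params, user_data, num_clusters):
    clusters = random_clusters(len(user_data), num_clusters)
    cluster_models = []
    for members in clusters:
        # Sequential in-cluster training: each user continues training the
        # model handed over by the previous user in the same cluster.
        params = global_params.copy()
        for u in members:
            params = local_train(params, user_data[u])
        # The cluster head (here, simply the last user) uploads the cluster
        # model, so only num_clusters uplink transmissions reach the PS.
        cluster_models.append(params)
    # The PS aggregates the cluster heads' models into the new global model.
    return np.mean(cluster_models, axis=0)

# Toy usage: 20 users, 4 clusters, a shared linear-regression target.
dim = 5
true_w = rng.normal(size=dim)
users = []
for _ in range(20):
    X = rng.normal(size=(32, dim))
    users.append((X, X @ true_w + 0.01 * rng.normal(size=32)))

w = np.zeros(dim)
for _ in range(50):
    w = fedseq_round(w, users, num_clusters=4)
print("parameter error:", np.linalg.norm(w - true_w))
```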
