Abstract

Federated learning (FL) has received widespread attention for decentralized training of deep learning models across devices while preserving privacy. Industrial big data in key applications like healthcare, smart manufacturing, autonomous driving, and robotics is inherently multi-source and heterogeneous. Recent studies have shown that the quality of the global FL model deteriorates in the presence of such non-IID data. To address this, we present a novel clustered FL framework called Federated Learning via Agglomerative Client Clustering (FLACC). FLACC greedily agglomerates clients or groups of clients based on their gradient updates while learning the global FL model. Once the clustering is complete, each cluster is separated into its own federation, allowing clients with similar underlying distributions to train together. In contrast with existing methods, FLACC does not require the number of clusters to be specified a priori, can handle partial client participation (client fractions), and is robust to hyperparameter tuning. We demonstrate the efficacy of this framework through extensive experiments on three benchmark FL datasets and an FL case study simulated using industrial mixed fault classification data. Qualitative clustering results show that FLACC accurately identifies clusters in the presence of various statistical heterogeneities in the client data. Quantitative results show that FLACC outperforms vanilla FL and state-of-the-art personalized and clustered FL methods, even when the underlying clustering structure is not apparent.
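The core idea described above, greedy agglomeration of clients by the similarity of their gradient updates, with a stopping rule instead of a preset cluster count, can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the cosine-similarity measure, average-linkage merging, and the `threshold` stopping parameter are all assumptions for illustration.

```python
import numpy as np

def cosine_sim(a, b):
    # Cosine similarity between two flattened gradient vectors.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def agglomerate_clients(updates, threshold=0.5):
    """Greedily merge client clusters whose (averaged) gradient updates
    are most similar, stopping once no pair exceeds `threshold`.

    updates: list of 1-D numpy arrays, one flattened update per client.
    Returns a list of clusters, each a list of client indices.
    Note: the similarity measure and threshold rule are illustrative
    assumptions, not necessarily those used by FLACC.
    """
    clusters = [[i] for i in range(len(updates))]
    means = [u.astype(float).copy() for u in updates]
    while len(clusters) > 1:
        best, pair = -np.inf, None
        # Find the most similar pair of current clusters.
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                s = cosine_sim(means[i], means[j])
                if s > best:
                    best, pair = s, (i, j)
        if best < threshold:
            break  # no sufficiently similar pair left: clustering is done
        i, j = pair
        ni, nj = len(clusters[i]), len(clusters[j])
        # Size-weighted average of the merged clusters' mean updates.
        means[i] = (ni * means[i] + nj * means[j]) / (ni + nj)
        clusters[i] += clusters[j]
        del clusters[j], means[j]
    return clusters
```

Because merging stops when the best remaining pairwise similarity falls below the threshold, the number of resulting federations emerges from the data rather than being fixed in advance, mirroring the "no a priori cluster count" property claimed for FLACC.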
