In federated learning (FL), clients may have diverse objectives, and merging all clients' knowledge into a single global model can cause negative transfer that degrades local performance. Clustered FL has therefore been proposed to group similar clients into clusters and maintain several global models. In the literature, centralized clustered FL algorithms require the number of clusters to be assumed in advance and hence cannot effectively explore the latent relationships among clients. In this paper, without assuming the number of clusters, we propose a peer-to-peer (P2P) FL algorithm named PANM, in which clients communicate with peers to adaptively form an effective clustered topology. Specifically, we present two novel metrics for measuring client similarity and a two-stage neighbor matching algorithm based on the Monte Carlo method and Expectation Maximization under the Gaussian Mixture Model assumption. We provide theoretical analyses of PANM covering the probability of correct neighbor estimation and the error gap to the clustered optimum, and we conduct extensive experiments under both synthetic and real-world clustered heterogeneity. Both the theoretical analysis and the empirical results show that the proposed algorithm is superior to its P2P FL counterparts and achieves better performance than the centralized clustered FL method. PANM remains effective even under extremely low communication budgets.
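To make the EM-under-GMM neighbor-matching idea concrete, below is a minimal single-stage sketch, not the paper's actual two-stage algorithm: it assumes cosine similarity of flattened model updates as the client-similarity metric (one plausible choice; the paper's two metrics are not reproduced here) and fits a two-component 1-D Gaussian mixture by EM to separate likely same-cluster peers from the rest. All names (`estimate_neighbors`, `em_two_gaussians`) are hypothetical.

```python
# Illustrative sketch of GMM-based neighbor estimation for P2P clustered FL.
# Assumptions (not from the paper): cosine similarity of model updates as the
# similarity metric, and a hand-rolled 1-D EM for a two-component mixture.
import numpy as np

def cosine_similarity(u, v):
    """Cosine similarity between two flattened model updates."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))

def em_two_gaussians(x, n_iter=100, tol=1e-6):
    """Fit a 2-component 1-D Gaussian mixture to similarities x via EM."""
    x = np.asarray(x, dtype=float)
    mu = np.array([x.min(), x.max()])            # init means at the extremes
    var = np.full(2, x.var() + 1e-6)
    pi = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: responsibility of each component for each similarity value
        dens = pi * np.exp(-(x[:, None] - mu) ** 2 / (2 * var)) \
               / np.sqrt(2 * np.pi * var)
        resp = dens / (dens.sum(axis=1, keepdims=True) + 1e-300)
        # M-step: re-estimate mixing weights, means, and variances
        nk = resp.sum(axis=0)
        new_mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - new_mu) ** 2).sum(axis=0) / nk + 1e-6
        pi = nk / len(x)
        converged = np.abs(new_mu - mu).max() < tol
        mu = new_mu
        if converged:
            break
    return mu, var, pi

def estimate_neighbors(my_update, peer_updates):
    """Return indices of peers assigned to the higher-mean GMM component."""
    sims = np.array([cosine_similarity(my_update, p) for p in peer_updates])
    mu, var, pi = em_two_gaussians(sims)
    hi = int(np.argmax(mu))                      # "same-cluster" component
    dens = pi * np.exp(-(sims[:, None] - mu) ** 2 / (2 * var)) \
           / np.sqrt(2 * np.pi * var)
    resp = dens / (dens.sum(axis=1, keepdims=True) + 1e-300)
    return np.where(resp[:, hi] > 0.5)[0]

# Toy usage: peers 0-4 share the client's objective, peers 5-9 do not.
rng = np.random.default_rng(0)
me = rng.normal(size=100)
peers = [me + 0.3 * rng.normal(size=100) for _ in range(5)] + \
        [rng.normal(size=100) for _ in range(5)]
print(estimate_neighbors(me, peers))             # likely prints [0 1 2 3 4]
```

Fitting a mixture rather than thresholding raw similarities lets each client adapt its neighbor set to its own similarity distribution, which is the motivation for the GMM assumption in the abstract; the paper's full method additionally uses Monte Carlo sampling and a second matching stage.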