Abstract

The heterogeneity of the data distribution generally degrades the performance of federated learning with neural networks. To obtain a well-performing global model, taking a weighted average of the local model parameters, as most existing federated learning algorithms do, does not guarantee that the result remains consistent with the local models in the space of neural network maps. In this paper, we highlight the significance of the space of neural network maps for alleviating the performance degradation caused by data heterogeneity and propose a novel federated learning framework equipped with a decentralized knowledge distillation process (FedDKD). In FedDKD, we introduce a decentralized knowledge distillation (DKD) module that distills the knowledge of the local models into the global model so that it approaches the average of the local models in the space of neural network maps, by optimizing a divergence defined in the loss function rather than merely averaging parameters as in the literature. Numerical experiments on various heterogeneous datasets reveal that FedDKD outperforms state-of-the-art methods, especially on some extremely heterogeneous datasets.
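
To make the idea concrete, the following is a minimal PyTorch-style sketch of a divergence-based distillation step in the spirit described above: the global model is trained to match the averaged soft predictions of the local models on a shared distillation set, instead of simply averaging their parameters. The function and variable names (`dkd_update`, `distill_loader`), the use of KL divergence, and the hyperparameters are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def dkd_update(global_model, local_models, distill_loader, lr=1e-3, epochs=1):
    """Illustrative distillation step: teach the global model to approach the
    average of the local models' outputs (a stand-in for the "neural network
    map average"), rather than averaging parameters."""
    for m in local_models:
        m.eval()
    optimizer = torch.optim.SGD(global_model.parameters(), lr=lr)
    for _ in range(epochs):
        for x, _ in distill_loader:
            with torch.no_grad():
                # Teacher signal: average of the local models' soft predictions.
                teacher_probs = torch.stack(
                    [F.softmax(m(x), dim=-1) for m in local_models]
                ).mean(dim=0)
            student_log_probs = F.log_softmax(global_model(x), dim=-1)
            # Divergence-based distillation loss (KL here; the paper defines
            # its own divergence in the loss function).
            loss = F.kl_div(student_log_probs, teacher_probs,
                            reduction="batchmean")
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return global_model
```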
