Federated Learning (FL) is an emerging approach in edge computing for collaboratively training machine learning models across multiple devices, aiming to address the limited bandwidth, system heterogeneity, and privacy issues of traditional centralized training. However, existing federated learning methods focus on learning a single shared global model for all devices, which may not be ideal for every device. The situation becomes even worse when each edge device has its own data distribution or task. In this paper, we study personalized federated learning, where the goal is to train models that perform well for individual clients. We observe that initializing local models from the global model at each communication round causes clients to forget historical personalized knowledge. Based on this observation, we propose a novel Personalized Federated Learning (PFL) framework via self-knowledge distillation, named pFedSD. By allowing clients to distill the knowledge of their previous personalized models into the current local models, pFedSD accelerates the recall of personalized knowledge after each round's initialization. Moreover, self-knowledge distillation provides different views of the data in feature space, realizing an implicit ensemble of local models. Extensive experiments on various datasets and settings demonstrate the effectiveness and robustness of pFedSD.
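The following is a minimal PyTorch-style sketch of the local update described above: each client treats its previous personalized model as a teacher and distills it into the freshly initialized local model during local training. The function name local_update, the distillation weight lam, and the temperature T are illustrative assumptions, not values or names taken from the paper.

```python
# Sketch of a client's local update with self-knowledge distillation (assumed form).
# Teacher = the client's personalized model from the previous round;
# Student = the local model freshly initialized from the global model.
import copy
import torch
import torch.nn.functional as F

def local_update(global_model, prev_personal_model, loader,
                 epochs=1, lr=0.01, T=3.0, lam=0.5):
    student = copy.deepcopy(global_model)   # re-initialize from the global model
    teacher = prev_personal_model           # previous personalized model (frozen)
    teacher.eval()
    opt = torch.optim.SGD(student.parameters(), lr=lr)

    for _ in range(epochs):
        for x, y in loader:
            logits = student(x)
            with torch.no_grad():
                t_logits = teacher(x)

            # Standard supervised loss on local data
            ce = F.cross_entropy(logits, y)
            # Distillation loss: KL between softened teacher and student predictions
            kd = F.kl_div(F.log_softmax(logits / T, dim=1),
                          F.softmax(t_logits / T, dim=1),
                          reduction="batchmean") * (T * T)

            loss = ce + lam * kd
            opt.zero_grad()
            loss.backward()
            opt.step()

    return student  # becomes this client's new personalized model
```

In this reading, the distillation term lets the re-initialized local model recover personalized knowledge quickly, while the regular cross-entropy term continues adapting it to the client's local data.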