Abstract

Federated Distillation (FD) extends classic Federated Learning (FL) to a more general training framework that enables model-heterogeneous collaborative learning by Knowledge Distillation (KD) across multiple clients and the server. However, existing KD-based algorithms usually require a set of shared input samples for each client to produce soft-prediction for distillation. Worse still, such a manual selection is accompanied by careful deliberations or prior information on clients' private data distribution, which is not in line with the privacy-preserving characteristic of classic FL. In this paper, we propose a novel training framework to achieve data-independent knowledge transfer by properly designing a distributed generative adversarial network (GAN) between the server and clients that can synthesize shared feature representations to facilitate the FD training. Specifically, we deploy a generator on the server and reuse each local model as a federated discriminator to form a lightweight efficient distributed GAN that can automatically synthesize simulated global feature representations for distillation. Moreover, since the synthesized feature representations are usually more faithful and homologous with global data distribution, faster and better training convergence can be obtained. Extensive experiments on different tasks and heterogeneous models demonstrate the effectiveness of the proposed framework on model accuracy and communication overhead.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call