Abstract

In real-world federated learning (FL), client training data may contain label noise, which harms the generalization performance of the global model. Most existing noisy-label learning methods rely on sample selection strategies that treat small-loss samples as correctly labeled and large-loss samples as mislabeled. However, large-loss samples may also be valuable hard samples with correct labels. Moreover, these selection strategies are ill-suited to the FL setting: they require training two models simultaneously and are executed in every mini-batch, which increases both communication cost and client computation. In this paper, we propose an efficient multi-stage federated learning framework to tackle label noise. First, we use self-supervised pre-training to extract category-independent features without using any labels. Second, we propose a federated static two-dimensional sample selection (FedSTSS) method, which uses category-probability entropy and loss as separation metrics to identify samples. Notably, to improve the accuracy of sample identification, we add a weighted entropy-minimization term to the cross-entropy loss function. Finally, we use a semi-supervised method to fine-tune the global model on the identified clean and mislabeled samples. Extensive experiments on multiple synthetic and real-world noisy datasets demonstrate that our method outperforms state-of-the-art methods.
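The two separation metrics named in the abstract (per-sample loss and category-probability entropy) and the entropy-regularized objective can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names, the weight `lam`, and the selection thresholds are assumptions introduced here.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the class dimension.
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def per_sample_metrics(logits, labels):
    """Return (cross-entropy loss, prediction entropy) per sample.

    These are the two separation metrics the abstract names:
    loss and category-probability entropy.
    """
    p = softmax(logits)
    n = len(labels)
    ce = -np.log(p[np.arange(n), labels] + 1e-12)
    ent = -(p * np.log(p + 1e-12)).sum(axis=1)
    return ce, ent

def regularized_loss(logits, labels, lam=0.1):
    # Illustrative combined objective: mean cross-entropy plus a
    # weighted entropy-minimization term (the weight `lam` is assumed).
    ce, ent = per_sample_metrics(logits, labels)
    return ce.mean() + lam * ent.mean()

def select_clean(logits, labels, loss_thresh, ent_thresh):
    # Static two-dimensional selection sketch: a sample is flagged as
    # clean only if both its loss and its prediction entropy are low
    # (the thresholds are assumptions for illustration).
    ce, ent = per_sample_metrics(logits, labels)
    return (ce < loss_thresh) & (ent < ent_thresh)
```

For example, a confidently and correctly classified sample (low loss, low entropy) passes both thresholds, while a near-uniform prediction fails both, which is the intuition behind using the two metrics jointly rather than loss alone.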
