Abstract

In real-world scenarios, data often follow a long-tailed distribution, and training deep neural networks on such imbalanced datasets has become a great challenge. The main problem caused by a long-tailed data distribution is that the common classes dominate the training results, yielding very low accuracy on the rare classes. Recent work focuses on improving the network representation ability to overcome the long-tailed problem, but typically ignores adapting the network classifier to the long-tailed case, which causes an "incompatibility" problem between the network representation and the network classifier. In this paper, we use knowledge distillation to solve the long-tailed data distribution problem and fully optimize the network representation and classifier simultaneously. We propose multi-expert knowledge distillation with class-balanced sampling to jointly learn a high-quality network representation and classifier. A channel activation-based knowledge distillation method is also proposed to improve the performance further. State-of-the-art performance on several large-scale long-tailed classification datasets shows the superior generalization of our method.
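To make the two ingredients named above concrete, the following is a minimal sketch, not the authors' implementation: class-balanced sampling weights each training sample inversely to its class frequency so every class is drawn equally often, and the distillation term is the standard temperature-scaled KL divergence between teacher and student soft targets. The function names and the temperature value are illustrative assumptions.

```python
import numpy as np

def class_balanced_probs(labels):
    """Per-sample draw probabilities for class-balanced sampling:
    p_i proportional to 1 / n_{class(i)}, so each class is sampled
    with equal total probability regardless of its frequency."""
    labels = np.asarray(labels)
    classes, counts = np.unique(labels, return_counts=True)
    freq = dict(zip(classes.tolist(), counts.tolist()))
    w = np.array([1.0 / freq[int(y)] for y in labels])
    return w / w.sum()

def softmax(z, T=1.0):
    """Numerically stable softmax with temperature T."""
    z = np.asarray(z, dtype=float) / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kd_loss(student_logits, teacher_logits, T=2.0):
    """Standard temperature-scaled KL distillation loss
    (teacher soft targets vs. student predictions)."""
    p = softmax(teacher_logits, T)  # teacher "soft targets"
    q = softmax(student_logits, T)
    kl = np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12)))
    return float(T * T * kl)  # T^2 keeps gradient scale comparable
```

For example, with labels `[0, 0, 0, 1]`, each of the three class-0 samples receives probability 1/6 and the single class-1 sample receives 1/2, so both classes are drawn equally often.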

Highlights

  • Datasets commonly used in the literature for CNN training, like CIFAR [1] and ImageNet [2], are usually artificially designed and rarely suffer from data imbalance

  • In the open real world, the distribution of data categories is often long-tailed, in which the number of training samples per class varies significantly, from thousands of images to a few samples. In scenarios such as railway traffic, mesothelioma diagnosis, and industrial fault detection [3, 4], we need to detect an unexpected object whose real samples are usually hard to collect, which leads to a long-tailed data distribution. There are many works [5, 6] proposed to solve such real-world classification problems

  • Authors in [7, 8] pointed out the problem that an imbalanced data distribution can heavily degrade the performance of deep neural networks


Introduction

Datasets commonly used in the literature for CNN training, like CIFAR [1] and ImageNet [2], are usually artificially designed and rarely suffer from data imbalance. In the open real world, however, in scenarios such as railway traffic, mesothelioma diagnosis, and industrial fault detection [3, 4], we need to detect an unexpected object whose real samples are usually hard to collect, which leads to a long-tailed data distribution. There are many works [5, 6] proposed to solve such real-world classification problems, but they do not provide a general solution to the long-tailed distribution problem. When deep models are trained in such imbalanced scenarios, standard approaches usually fail to achieve satisfactory results, leading to a significant drop in performance
