Abstract
In real-world scenarios, data often follow a long-tailed distribution, and training deep neural networks on such imbalanced datasets has become a great challenge. The main problem caused by a long-tailed data distribution is that common classes dominate the training, resulting in very low accuracy on rare classes. Recent work focuses on improving the network's representation ability to overcome the long-tailed problem, but it often neglects adapting the classifier to the long-tailed case, which causes an "incompatibility" between the network representation and the classifier. In this paper, we use knowledge distillation to solve the long-tailed distribution problem and fully optimize the network representation and the classifier simultaneously. We propose multi-expert knowledge distillation with class-balanced sampling to jointly learn a high-quality network representation and classifier. A channel activation-based knowledge distillation method is also proposed to further improve performance. State-of-the-art results on several large-scale long-tailed classification datasets show the superior generalization of our method.
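As a concrete illustration of the distillation component, a standard temperature-scaled knowledge distillation loss (Hinton-style soft-target matching; a minimal sketch of the general technique, not the authors' exact multi-expert formulation) can be written as:

```python
import numpy as np

def softmax(logits, T=1.0):
    # Temperature-scaled softmax; a higher T produces a softer distribution.
    z = np.asarray(logits, dtype=float) / T
    z = z - z.max()  # numerical stability
    e = np.exp(z)
    return e / e.sum()

def kd_loss(student_logits, teacher_logits, T=2.0):
    # KL divergence between the teacher's and student's soft targets,
    # scaled by T^2 so gradient magnitudes are comparable across temperatures.
    p = softmax(teacher_logits, T)  # teacher soft targets
    q = softmax(student_logits, T)  # student predictions
    return float(T * T * np.sum(p * (np.log(p) - np.log(q))))
```

The loss is zero when the student exactly matches the teacher's logits and positive otherwise; in a multi-expert setting one would aggregate such terms over several teacher networks.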
Highlights
Datasets commonly used in the literature for CNN training, such as CIFAR [1] and ImageNet [2], are usually artificially curated and rarely suffer from data imbalance
In the open real world, the distribution of data categories is often long-tailed: the number of training samples per class varies significantly, from thousands of images down to a few samples. In scenarios such as railway traffic, mesothelioma diagnosis, and industrial fault detection [3, 4], we need to detect unexpected objects whose real samples are usually hard to collect, which leads to a long-tailed data distribution. Many works [5, 6] have been proposed to solve such real-world classification problems
Authors in [7, 8] pointed out that the data distribution heavily influences the performance of deep neural networks
Summary
Datasets commonly used in the literature for CNN training, such as CIFAR [1] and ImageNet [2], are usually artificially curated and rarely suffer from data imbalance. In scenarios such as railway traffic, mesothelioma diagnosis, and industrial fault detection [3, 4], we need to detect unexpected objects whose real samples are usually hard to collect, which leads to a long-tailed data distribution. Many works [5, 6] have been proposed to solve such real-world classification problems, but they do not provide a general solution to the long-tailed distribution problem. When deep models are trained in such imbalanced scenarios, standard approaches usually fail to achieve satisfactory results, leading to a significant drop in performance
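The class-balanced sampling referred to in the abstract can be illustrated with a minimal sketch (this shows the standard re-weighting idea, not the paper's specific implementation): each sample is weighted inversely to its class frequency, so every class is drawn equally often in expectation regardless of how many samples it has.

```python
from collections import Counter

def class_balanced_weights(labels):
    # Assign each sample the weight 1 / (n_classes * count(its class)),
    # so the total sampling probability of every class is 1 / n_classes.
    counts = Counter(labels)
    n_classes = len(counts)
    return [1.0 / (n_classes * counts[y]) for y in labels]

# Example: a long-tailed toy dataset with 90 head-class and 10 tail-class samples.
labels = [0] * 90 + [1] * 10
weights = class_balanced_weights(labels)
head_mass = sum(w for w, y in zip(weights, labels) if y == 0)
tail_mass = sum(w for w, y in zip(weights, labels) if y == 1)
```

Both `head_mass` and `tail_mass` come out to 0.5, i.e. the rare class is sampled as often as the common one; in a deep learning pipeline these weights would typically feed a weighted random sampler.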