Robust Learning From Noisy Web Images Via Data Purification for Fine-Grained Recognition

Chuanyi Zhang, Zhenmin Tang, Qiong Wang, Qi Wu, Guosen Xie, Fumin Shen

doi:10.1109/tmm.2021.3134156

Abstract

Manually labeling fine-grained datasets is laborious and typically requires domain-specific expert knowledge. Conversely, a vast amount of web data is relatively easy to obtain with nearly no human effort. Therefore, learning from noisy web data for fine-grained tasks has attracted increasing attention in recent years. However, the presence of noise in web images is a major obstacle to training robust fine-grained recognition models. To this end, we propose a novel approach that identifies noisy images and explicitly distinguishes in-distribution from out-of-distribution samples. It purifies the noisy web training set by discarding out-of-distribution noise and relabeling in-distribution noisy samples. The model is then trained on the purified dataset, which alleviates the harmful effects of noise and makes the most of web images to achieve better performance. Extensive experiments on three commonly used fine-grained datasets demonstrate that our approach is far superior to current state-of-the-art web-supervised methods. The data and source code of this work have been made publicly available at: https://github.com/NUST-Machine-Intelligence-Laboratory/Dataset-Purification
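
The purification step described above (keep clean samples, relabel in-distribution noise, discard out-of-distribution noise) can be illustrated with a minimal sketch. The abstract does not state how noisy samples are identified, so the confidence-based rule below is purely an assumption for illustration and is not the authors' criterion; the function name `purify` and both thresholds are hypothetical.

```python
from typing import List, Sequence, Tuple


def purify(
    labels: Sequence[int],
    probs: Sequence[Sequence[float]],
    keep_threshold: float = 0.5,
    relabel_threshold: float = 0.8,
) -> Tuple[List[int], List[int]]:
    """Return (kept_indices, purified_labels) for a noisy training set.

    ASSUMPTION: noise is detected from per-sample class probabilities
    produced by some model. The paper's actual criterion may differ.
    """
    kept_indices: List[int] = []
    purified_labels: List[int] = []
    for i, (label, p) in enumerate(zip(labels, probs)):
        # Class the model considers most likely for this sample.
        top_class = max(range(len(p)), key=lambda c: p[c])
        if p[label] >= keep_threshold:
            # The web label is supported by the model: treat as clean.
            kept_indices.append(i)
            purified_labels.append(label)
        elif p[top_class] >= relabel_threshold:
            # Confident about another known class: treat as
            # in-distribution noise and relabel.
            kept_indices.append(i)
            purified_labels.append(top_class)
        # Otherwise no known class fits well: treat as
        # out-of-distribution noise and discard the sample.
    return kept_indices, purified_labels


# Toy data: one clean sample, one mislabeled sample, one ambiguous sample.
web_labels = [0, 1, 2]
predicted = [
    [0.90, 0.05, 0.05],
    [0.85, 0.10, 0.05],
    [0.40, 0.30, 0.30],
]
kept, new_labels = purify(web_labels, predicted)
```

In this toy run the first sample keeps its label, the second is relabeled to class 0, and the third is dropped. A model would then be trained only on the kept indices with the purified labels.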

Full Text