Abstract

Manually labeling fine-grained datasets is laborious and typically requires domain-specific expert knowledge. Conversely, a vast amount of web data is relatively easy to obtain with nearly no human effort. Therefore, learning from noisy web data for fine-grained tasks has attracted increasing attention in recent years. However, the presence of noise in web images is a major obstacle to training robust fine-grained recognition models. To this end, we propose a novel approach that identifies noisy images and explicitly distinguishes in-distribution from out-of-distribution noisy samples. It purifies the noisy web training set by discarding out-of-distribution noise and relabeling in-distribution noisy samples. The model is then trained on the purified dataset, which alleviates the harmful effects of noise and makes the most of web images to achieve better performance. Extensive experiments on three commonly used fine-grained datasets demonstrate that our approach is far superior to current state-of-the-art web-supervised methods. The data and source code of this work have been made publicly available at: https://github.com/NUST-Machine-Intelligence-Laboratory/Dataset-Purification
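
The purification step described above (keep clean samples, relabel in-distribution noise, discard out-of-distribution noise) can be illustrated with a minimal sketch. The abstract does not say how noisy images are detected, so the decision rule below is purely an assumption for illustration: a sample counts as clean when the model's prediction agrees with its web label, as in-distribution noise when the model confidently predicts a different class, and as out-of-distribution noise otherwise. The function name `purify`, the parameter `ood_threshold`, and its default value are hypothetical and are not taken from the paper or its repository.

```python
from typing import List, Tuple


def purify(
    probs: List[List[float]],
    web_labels: List[int],
    ood_threshold: float = 0.5,  # hypothetical cutoff, not from the paper
) -> List[Tuple[int, int]]:
    """Return (sample_index, label) pairs forming the purified training set.

    probs[i] holds the model's class probabilities for sample i, and
    web_labels[i] is the (possibly wrong) label obtained from the web.
    """
    purified = []
    for idx, (p, web_label) in enumerate(zip(probs, web_labels)):
        # Index of the most probable class and its probability.
        predicted = max(range(len(p)), key=p.__getitem__)
        confidence = p[predicted]
        if predicted == web_label:
            # Assumed clean: keep the sample with its web label.
            purified.append((idx, web_label))
        elif confidence >= ood_threshold:
            # Assumed in-distribution noise: keep the image, relabel it.
            purified.append((idx, predicted))
        # Otherwise assumed out-of-distribution noise: discard the sample.
    return purified
```

With three classes, a sample whose prediction matches its web label is kept, a confidently mispredicted sample is relabeled, and a sample with a near-uniform prediction that disagrees with its label is dropped. A model would then be trained only on the returned pairs.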
