Abstract

Most recent state-of-the-art algorithms for handling noisy labels rely on the memorization effect, the phenomenon that deep neural networks (DNNs) memorize clean data before noisy data. While the memorization effect can be a powerful tool, there are several cases in which it does not occur, such as imbalanced class distributions and heavy label contamination. To address this limitation, we introduce an entirely new approach called interpolation with the over-fitted model (IOFM), which leverages over-fitted DNNs. The IOFM exploits a new finding about over-fitted DNNs: for a given training sample, its neighbors chosen in the feature space are distributed differently in the original input space depending on whether the target sample is clean or noisy. The IOFM has two notable features: 1) it yields superior results even when the training data are imbalanced or heavily noisy; 2) because it utilizes over-fitted DNNs, it requires no fine-tuning procedure to select the optimal training epoch, an essential yet sensitive factor for the success of the memorization effect, and thus the IOFM can be used by non-experts. Through extensive experiments, we show that our method serves as a promising alternative to existing solutions for noisy labels, offering improved performance even in challenging settings.
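
To make the reported finding concrete, the sketch below illustrates one way the feature-space neighborhoods of an over-fitted DNN could be compared against input-space distances to score each training sample. This is a minimal illustration under assumptions, not the authors' IOFM implementation: the function name neighbor_dispersion_scores, the choice of k, and the use of mean Euclidean input-space distance as the dispersion measure are all hypothetical.

    # Hypothetical sketch (not the paper's IOFM algorithm): score each training
    # sample by how its feature-space neighbors spread out in the input space.
    import numpy as np
    from sklearn.neighbors import NearestNeighbors

    def neighbor_dispersion_scores(inputs, features, k=10):
        """inputs:   (n, d_in) flattened raw training samples
           features: (n, d_f)  penultimate-layer features of an over-fitted DNN
           Returns one dispersion score per sample; per the paper's finding,
           clean and noisy samples should yield differently distributed scores."""
        nn = NearestNeighbors(n_neighbors=k + 1).fit(features)
        _, idx = nn.kneighbors(features)       # column 0 is the sample itself
        scores = np.empty(len(inputs))
        for i, neigh in enumerate(idx[:, 1:]):  # drop the self-match
            # mean input-space distance to the k feature-space neighbors
            scores[i] = np.mean(np.linalg.norm(inputs[neigh] - inputs[i], axis=1))
        return scores

Such scores could then, for example, be thresholded to flag likely-noisy samples before any interpolation step; how IOFM actually uses the neighborhoods is specified in the paper itself.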
