Image retrieval is a challenging problem that requires learning features generalized enough to identify unseen classes, even with very few training samples per class. In this article, we propose a novel method for fine-tuning pretrained deep networks that yields more generalized features when learning retrieval data sets. In the retrieval task, we observed a phenomenon in which the reduction of the loss stagnates during fine-tuning, even while the weights continue to be updated substantially. To escape this stagnated state, we propose a new fine-tuning strategy that rolls back some of the weights to their pretrained values. The rollback scheme is observed to drive the learning path into a gentle basin of the loss landscape, which provides more generalized features than a sharp basin. In addition, we propose a multihead ensemble structure that creates synergy among the multiple local minima obtained by our rollback scheme. Experimental results show that the proposed learning method significantly improves generalization performance, achieving state-of-the-art results on the Inshop and SOP data sets.
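The abstract does not spell out how the rollback is applied; as a rough illustration of the core idea only, a minimal PyTorch-style sketch might look like the following. The helper name `rollback_weights`, the `rollback_names` parameter, and the choice of which parameters to reset are all assumptions for illustration, not details from the paper.

```python
import torch
import torch.nn as nn


def rollback_weights(model: nn.Module,
                     pretrained_state: dict,
                     rollback_names: set) -> None:
    """Roll back a chosen subset of parameters to their pretrained values.

    All names and the selection criterion are hypothetical; the paper's
    actual rule for picking which weights to roll back is not given here.
    """
    with torch.no_grad():
        for name, param in model.named_parameters():
            if name in rollback_names:
                # Reset this parameter to its pretrained value,
                # leaving the remaining fine-tuned weights untouched.
                param.copy_(pretrained_state[name])


# Hypothetical usage: snapshot the pretrained weights before fine-tuning,
# then roll back a subset once the loss appears to stagnate.
model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 32))
pretrained_state = {k: v.clone() for k, v in model.state_dict().items()}
# ... fine-tune the model until the loss stagnates ...
rollback_weights(model, pretrained_state, rollback_names={"0.weight"})
```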