Abstract

Visual representations for fine-grained visual recognition can be learned by forcing all samples of the same category toward a uniform representation. This strict training objective performs well in the closed-set setting but does not generalize to data in the wild with noisy annotations and long-tailed distributions; for example, it may bias the feature space toward head categories. This paper tackles the challenge by pursuing a more balanced and discriminative feature space: intra-class variances are first retained to isolate noise and then eliminated to improve recognition performance. We propose the Compact Memory Updater to maintain a memory bank that memorizes proxy features representing multiple typical appearances of each category in the training set. The Proxy-based Feature Enhancement then leverages these proxy features to ensure that samples of the same category have similar features. Iteratively running these two modules boosts the robustness and discriminative power of the learned representation, which facilitates various fine-grained visual recognition tasks, including person re-identification (re-id), image classification, and retrieval. Extensive experiments on noisy and long-tailed training sets show that this Multi-Proxy Feature Learning (MPFL) framework achieves promising performance. For instance, on a training set with 90% one-shot categories, MPFL outperforms the recent long-tailed person re-id method LEAP-AF by 16.9% in rank-1 accuracy.
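To make the proxy-memory idea concrete, below is a minimal, hypothetical sketch of a multi-proxy memory bank with a momentum update and a proxy-based similarity loss. The abstract does not specify the actual design of the Compact Memory Updater or the Proxy-based Feature Enhancement, so the class name, the number of proxies per category, the momentum rule, and the softmax-style loss here are all assumptions chosen for illustration, not the paper's method.

```python
import torch
import torch.nn.functional as F

class MultiProxyMemory:
    """Hypothetical sketch: K proxy features per category in a memory bank.

    Each proxy is meant to capture one typical appearance of a category;
    proxies are refreshed with a momentum update, a common choice for
    memory banks (assumed here, not taken from the paper).
    """

    def __init__(self, num_classes, num_proxies=3, dim=2048, momentum=0.9):
        self.momentum = momentum
        # (C, K, D) bank of L2-normalized proxy features
        self.proxies = F.normalize(
            torch.randn(num_classes, num_proxies, dim), dim=-1)

    @torch.no_grad()
    def update(self, feats, labels):
        """Assign each sample to its nearest same-class proxy and update it."""
        feats = F.normalize(feats, dim=-1)
        for f, y in zip(feats, labels):
            sims = self.proxies[y] @ f          # similarity to the K proxies of class y
            k = sims.argmax()                   # the nearest proxy absorbs this sample
            blended = self.momentum * self.proxies[y, k] + (1 - self.momentum) * f
            self.proxies[y, k] = F.normalize(blended, dim=0)

    def proxy_loss(self, feats, labels, temperature=0.05):
        """Pull each sample toward its own class proxies, away from other classes."""
        feats = F.normalize(feats, dim=-1)
        # logits over all C*K proxies; the temperature sharpens the contrast
        logits = feats @ self.proxies.flatten(0, 1).t() / temperature  # (B, C*K)
        num_proxies = self.proxies.shape[1]
        probs = logits.softmax(dim=1).view(len(feats), -1, num_proxies)  # (B, C, K)
        # probability mass assigned to the K proxies of the true class
        pos = probs[torch.arange(len(feats)), labels].sum(dim=1)
        return -(pos + 1e-12).log().mean()
```

In such a setup, `update` would be called with detached backbone features after each training step (maintaining the memory, in the spirit of the Compact Memory Updater), while `proxy_loss` would be added to the training objective so that same-category samples are drawn toward shared proxies (in the spirit of the Proxy-based Feature Enhancement).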
