Abstract

Over the past decade, we have witnessed rapid progress in compact representation learning for fast image retrieval. In the unsupervised setting, product quantization (PQ) is one of the most promising methods for generating compact image representations for fast and accurate retrieval. Inspired by the great success of deep neural networks (DNNs) in computer vision, many works have attempted to integrate PQ into DNNs for end-to-end supervised training. Nevertheless, in existing deep PQ methods, data samples from different classes share the same codebook and thus might become entangled with each other in the feature space. Meanwhile, existing deep PQ methods that rely on triplet or pairwise losses require a huge number of training triplets or pairs, which are computationally expensive and scale poorly. In this work, we propose a multiple exemplars learning (MEL) approach to improve retrieval accuracy and training efficiency. For each class, we learn a class-specific codebook consisting of multiple exemplars to partition the class-specific feature space. Since both the feature space and the codebook are class-specific, samples of different classes are disentangled in the feature space. We incorporate the proposed MEL into a convolutional neural network, supporting end-to-end training. Moreover, we propose the MEL loss, which trains the network considerably more efficiently than existing deep product quantization approaches based on pairwise or triplet losses. Systematic experiments on two public benchmarks demonstrate the effectiveness and efficiency of our method.
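For readers less familiar with the underlying technique, the sketch below illustrates plain product quantization encoding, which the abstract builds on: a feature vector is split into subvectors, and each subvector is assigned to its nearest codeword in the corresponding sub-codebook. This is a minimal illustration under our own assumptions (the function name pq_encode and all shapes are hypothetical), not the authors' implementation.

    import numpy as np

    def pq_encode(x, codebooks):
        # Split x into M subvectors and assign each subvector to its
        # nearest codeword (exemplar) in the corresponding sub-codebook.
        # x:         (D,) feature vector
        # codebooks: list of M arrays, each of shape (K, D // M)
        M = len(codebooks)
        sub_dim = x.shape[0] // M
        codes = []
        for m, codebook in enumerate(codebooks):
            sub = x[m * sub_dim:(m + 1) * sub_dim]
            dists = np.linalg.norm(codebook - sub, axis=1)
            codes.append(int(np.argmin(dists)))
        return codes

    # Toy usage: D = 8 features, M = 2 subspaces, K = 4 codewords each.
    rng = np.random.default_rng(0)
    codebooks = [rng.standard_normal((4, 4)) for _ in range(2)]
    x = rng.standard_normal(8)
    print(pq_encode(x, codebooks))  # prints two codeword indices, one per subspace

    # A class-specific variant in the spirit of MEL (hypothetical sketch)
    # would keep one set of codebooks per class and quantize each training
    # sample with the codebooks of its own class:
    #   class_codebooks = {c: [...] for c in classes}
    #   codes = pq_encode(x, class_codebooks[label])

In standard PQ, the same shared codebooks quantize every sample; the class-specific variant sketched in the comments reflects the abstract's core idea that a per-class codebook keeps samples of different classes disentangled in the feature space.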
