Abstract

Visual descriptor learning seeks a projection that embeds local descriptors (e.g., SIFT descriptors) into a new Euclidean space where pairs of matching descriptors (positive pairs) are better separated from pairs of non-matching descriptors (negative pairs). The original descriptors often confuse the two kinds of pairs: local points labeled non-matching can yield descriptors that lie close together (irrelevant-near), and local points labeled matching can yield descriptors that lie far apart (relevant-far), because images differ in viewpoint, resolution, noise, and illumination. In this paper, we formulate the embedding as a regularized discriminant analysis that emphasizes relevant-far pairs and irrelevant-near pairs to better separate negative pairs from positive pairs. We then extend our method to a nonlinear mapping by employing recent work on explicit kernel mappings. Experiments on object retrieval for landmark buildings in Oxford and Paris demonstrate the high performance of our method compared to existing methods.
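To make the idea concrete, the following is a minimal sketch of a discriminant-style pair embedding in this spirit, not the authors' exact formulation: positive-pair and negative-pair difference scatters are estimated, with a hypothetical weighting that up-weights relevant-far positives and irrelevant-near negatives, and a regularized generalized eigenproblem yields the projection. All function and parameter names here are illustrative assumptions.

```python
import numpy as np

def learn_embedding(pos_a, pos_b, neg_a, neg_b, dim, reg=1e-3):
    """Illustrative sketch: learn a linear projection W that spreads
    negative pairs relative to positive pairs.

    pos_a, pos_b : (n_pos, d) arrays of matching descriptor pairs.
    neg_a, neg_b : (n_neg, d) arrays of non-matching descriptor pairs.
    dim          : target embedding dimension.
    reg          : ridge regularization on the positive-pair scatter.
    """
    def scatter(A, B, w):
        # Weighted scatter of pair differences: sum_i w_i (a_i - b_i)(a_i - b_i)^T
        D = A - B
        return (D * w[:, None]).T @ D / w.sum()

    # Hypothetical emphasis: weight each pair by how confusing it is.
    d_pos = np.linalg.norm(pos_a - pos_b, axis=1)   # large => relevant-far
    d_neg = np.linalg.norm(neg_a - neg_b, axis=1)   # small => irrelevant-near
    w_pos = d_pos / (d_pos.mean() + 1e-12)          # emphasize far positives
    w_neg = d_neg.mean() / (d_neg + 1e-12)          # emphasize near negatives

    d = pos_a.shape[1]
    S_pos = scatter(pos_a, pos_b, w_pos) + reg * np.eye(d)  # regularized
    S_neg = scatter(neg_a, neg_b, w_neg)

    # Maximize w^T S_neg w / w^T S_pos w via whitening + symmetric eigh.
    L_inv = np.linalg.inv(np.linalg.cholesky(S_pos))
    evals, evecs = np.linalg.eigh(L_inv @ S_neg @ L_inv.T)  # ascending order
    W = (L_inv.T @ evecs)[:, ::-1][:, :dim]  # keep top-dim directions
    return W
```

Descriptors are then embedded as `X @ W`; the nonlinear extension in the paper would first pass descriptors through an explicit kernel feature map before applying the same machinery.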
