Abstract

Learning robust feature descriptors from multiple views plays an important role in image matching and its downstream tasks. Due to various sources of uncertainty, from illumination to viewpoint, robust feature matching remains a challenging task, and the generalization properties of learned feature descriptors remain poorly understood. Our intuition is that ensemble learning can assemble a collection of feature descriptors that mine richer information for each pixel, leading to improved robustness and generalization. In this paper, we propose a new image feature description method named Orthogonal Descriptor Network (OD-Net), which describes each pixel with a multi-branch structure and then fuses the branch outputs. To prevent the model from collapsing into an ill-posed solution and to encourage the network to mine complementary information, we design an orthogonality constraint and model it as a novel loss function. Notably, the idea of orthogonal feature extraction is general and can easily be plugged into many existing frameworks. Extensive experiments show that OD-Net produces better results than the current state of the art on a variety of image-matching tasks and evaluation metrics.
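The abstract does not specify OD-Net's exact loss, but the general idea of a multi-branch descriptor with an orthogonality penalty can be sketched as follows. This is a minimal PyTorch illustration, assuming per-pixel descriptors are L2-normalized along the channel dimension, orthogonality is encouraged by penalizing the squared cosine similarity between branch outputs, and fusion is done by concatenation; the function name `orthogonality_loss` and all shapes are hypothetical, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def orthogonality_loss(branch_descs):
    """Penalize pairwise similarity between per-branch dense descriptors.

    branch_descs: list of tensors, each of shape (B, C, H, W), one per branch.
    After L2-normalizing along the channel dimension, the channel-wise inner
    product equals the per-pixel cosine similarity between two branches.
    """
    normed = [F.normalize(d, dim=1) for d in branch_descs]
    loss = branch_descs[0].new_zeros(())
    pairs = 0
    for i in range(len(normed)):
        for j in range(i + 1, len(normed)):
            # Squared cosine similarity per pixel: zero when the two branches
            # emit orthogonal descriptors, pushing them to be complementary.
            cos = (normed[i] * normed[j]).sum(dim=1)
            loss = loss + (cos ** 2).mean()
            pairs += 1
    return loss / max(pairs, 1)

# Toy usage: two branches emitting 128-D descriptors over a 32x32 feature map.
d1 = torch.randn(2, 128, 32, 32)
d2 = torch.randn(2, 128, 32, 32)
# One plausible fusion: concatenate normalized branch outputs, renormalize.
fused = F.normalize(torch.cat([F.normalize(d1, dim=1),
                               F.normalize(d2, dim=1)], dim=1), dim=1)
print(orthogonality_loss([d1, d2]).item())
```

In this sketch the orthogonality term would be added, with some weight, to whatever descriptor-matching loss the base framework already uses, which is consistent with the abstract's claim that the idea can be plugged into existing pipelines.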
