Abstract

Seeking reliable correspondences is a fundamental and significant task in computer vision. Recent work has demonstrated that this task can be effectively accomplished by deep networks based on multi-layer perceptrons, which use context normalization to process the input. However, context normalization treats every correspondence equally, which weakens the representation of potential inliers. To address this problem, we propose a novel and effective Local-Global Self-Attention (LAGA) layer based on the self-attention mechanism, which captures the contextual information of potential inliers from coarse to fine while suppressing outliers during input processing. The global self-attention module captures abundant global contextual information across the whole image, while the local self-attention module extracts rich local contextual information within local regions. We then combine the global and local contextual information to obtain richer context and feature maps with stronger representational capacity. Extensive experiments show that networks equipped with our LAGA layer outperform the original and other comparative networks on outlier removal and camera pose estimation in both outdoor and indoor scenes.
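The abstract does not specify the LAGA layer's internals, but the global/local split it describes can be illustrated with a minimal NumPy sketch. Everything below is an assumption for illustration only: identity query/key/value projections, a k-nearest-neighbour definition of the "local region", and simple concatenation as the fusion step; the paper's actual layer will differ.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def global_self_attention(feats):
    # feats: (N, d) features of N putative correspondences.
    # Plain scaled dot-product self-attention over ALL correspondences
    # (identity Q/K/V projections for brevity -- an assumption, not the paper's design).
    d = feats.shape[1]
    scores = feats @ feats.T / np.sqrt(d)   # (N, N) pairwise affinities
    return softmax(scores, axis=1) @ feats  # globally context-weighted features

def local_self_attention(feats, coords, k=3):
    # Attend only over each correspondence's k nearest neighbours in
    # coordinate space -- a simple stand-in for a "local region".
    n, d = feats.shape
    out = np.empty_like(feats)
    for i in range(n):
        dist = np.linalg.norm(coords - coords[i], axis=1)
        idx = np.argsort(dist)[:k]              # k nearest (including self)
        local = feats[idx]                      # (k, d)
        scores = local @ feats[i] / np.sqrt(d)  # (k,) affinities to point i
        out[i] = softmax(scores) @ local        # locally context-weighted feature
    return out

# Each putative correspondence is a 4-vector (x1, y1, x2, y2).
rng = np.random.default_rng(0)
corr = rng.standard_normal((8, 4))

g = global_self_attention(corr)              # global contextual information
l = local_self_attention(corr, corr[:, :2])  # local contextual information
fused = np.concatenate([g, l], axis=1)       # combined feature map, shape (8, 8)
print(fused.shape)
```

In this sketch the fused features carry both image-wide and neighbourhood context per correspondence; a real layer would follow the fusion with learned projections before any inlier/outlier scoring.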
