Abstract

Visual localization, a fundamental component of several computer vision tasks, has been predominantly realized by scene coordinate regression (SCoRe) techniques. These methods use neural networks to predict scene coordinates and then apply a PnP algorithm to recover the 6-DOF camera pose. However, similar image patches are prevalent in indoor scenes, so comparable features are extracted for the regression of different scene coordinates. As a result, localization accuracy degrades severely. In this work, we develop ALNet, a novel SCoRe method that incorporates a local discrepancy perception module (LDPM) and an adaptive channel attention module (ACAM) to address this challenge. For LDPM, our key insight is that the scene attributes surrounding similar image patches differ from one patch to another. Technically, for each image patch, LDPM identifies a certain number of the most dissimilar patches around it and computes difference vectors to enrich the patch's own features, thereby enabling similar image patches to be differentiated. Because geometric attributes help distinguish similar patches while semantic context helps encode the regression problem, integrating multi-level features is an effective way to improve localization accuracy. Therefore, ACAM concatenates multi-level features and uses both average pooling and max pooling to generate reliable channel-wise weighting coefficients, thereby modeling the correlation among channels to integrate multi-level features effectively. Comprehensive experiments are conducted on mainstream indoor localization benchmarks and in a real-world environment, showing that ALNet achieves impressive performance. Source code and a video of the experimental results are available at https://github.com/DAMMONGAO/alnet.
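The two mechanisms named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `ldpm_enrich` and `channel_attention`, the square neighbourhood, the Euclidean dissimilarity measure, the averaging of difference vectors, and the omission of any learned layers between pooling and the sigmoid are all assumptions made only for illustration.

```python
import math


def l2(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def ldpm_enrich(feats, radius=1, k=2):
    """LDPM-style enrichment (hypothetical sketch).

    feats is an H x W grid of feature vectors (lists of floats). For each
    patch, pick the k most dissimilar patches within `radius`, average
    their difference vectors, and append the result to the patch feature.
    """
    h, w = len(feats), len(feats[0])
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            center = feats[i][j]
            neighbours = [
                feats[y][x]
                for y in range(max(0, i - radius), min(h, i + radius + 1))
                for x in range(max(0, j - radius), min(w, j + radius + 1))
                if (y, x) != (i, j)
            ]
            # Assumption: "most dissimilar" means largest Euclidean distance.
            far = sorted(neighbours, key=lambda n: l2(center, n), reverse=True)[:k]
            if far:
                diff = [
                    sum(n[c] - center[c] for n in far) / len(far)
                    for c in range(len(center))
                ]
            else:
                # A 1 x 1 grid has no neighbours; fall back to zeros.
                diff = [0.0] * len(center)
            row.append(list(center) + diff)
        out.append(row)
    return out


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(levels):
    """ACAM-style channel weighting (hypothetical sketch).

    levels is a list of feature maps; each map is a list of channels and
    each channel a flat list of spatial values. Channels from all levels
    are concatenated, then each channel is scaled by a weight derived
    from its average-pooled and max-pooled responses.
    """
    channels = [ch for level in levels for ch in level]
    # Assumption: the two pooled values are summed and squashed directly;
    # a trained module would normally apply learned layers first.
    weights = [sigmoid(sum(ch) / len(ch) + max(ch)) for ch in channels]
    weighted = [[w * v for v in ch] for w, ch in zip(weights, channels)]
    return weighted, weights
```

In this sketch the enriched feature doubles in length, because the averaged difference vector is appended to the original patch feature; a weight near 1 in `channel_attention` leaves a channel almost unchanged, while a weight near 0 suppresses it.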
