Abstract
Deep learning-based camera pose regression approaches have achieved outstanding performance in visual indoor localisation. However, they are limited by the availability of images with known camera poses, and they often require a comprehensive mapping of the indoor scene, which is labour-intensive and often impractical. Recent studies have shown that synthetic images rendered from simple 3D building models can be used to train deep learning models for cross-domain synthetic-to-real visual localisation, but the performance of such cross-domain localisation models is degraded by the domain gap between the real and synthetic images. In this study, we propose a domain adaptation approach based on a Generative Adversarial Network (GAN) framework, trained on a set of unpaired synthetic and real images to generate real-looking synthetic images and synthetic-looking real images. These images are then used to train current deep learning-based camera pose regression networks. We develop two domain adaptation strategies for the localisation task, namely, synthetic-to-real and real-to-synthetic image style transfer. The results show that the best model is obtained with real-to-synthetic style transfer, achieving a localisation error of 0.31 m and an orientation error of 3.36°, approximately three times lower than previously reported results.
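The abstract reports a localisation (position) error in metres and an orientation error in degrees. The paper itself is not reproduced here, so as an illustration only, the following sketch shows how such pose-regression error metrics are conventionally computed: position error as the Euclidean distance between estimated and ground-truth camera positions, and orientation error as the angular difference between two unit quaternions. The function names and the (w, x, y, z) quaternion convention are assumptions, not taken from this paper.

```python
import numpy as np

def translation_error(t_est, t_gt):
    """Euclidean distance (metres) between estimated and ground-truth camera positions."""
    return float(np.linalg.norm(np.asarray(t_est, dtype=float) - np.asarray(t_gt, dtype=float)))

def rotation_error_deg(q_est, q_gt):
    """Angular difference in degrees between two quaternions in (w, x, y, z) order.

    Uses |dot| so that q and -q (which encode the same rotation) give zero error.
    """
    q_est = np.asarray(q_est, dtype=float)
    q_gt = np.asarray(q_gt, dtype=float)
    q_est = q_est / np.linalg.norm(q_est)
    q_gt = q_gt / np.linalg.norm(q_gt)
    d = min(1.0, abs(float(np.dot(q_est, q_gt))))  # clamp to avoid arccos domain errors
    return float(2.0 * np.degrees(np.arccos(d)))
```

With metrics like these, a reported result such as "0.31 m / 3.36°" would typically be the median or mean of the per-image errors over a test set.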
ISPRS Journal of Photogrammetry and Remote Sensing