Abstract
Scene reconstruction and visual localization in dynamic environments such as street scenes are challenging due to the lack of distinctive, stable keypoints. While learned convolutional features have proven robust to changes in viewing conditions, handcrafted features still hold advantages in distinctiveness and accuracy when applied to structure from motion. For collaborative reconstruction of road sections by a car fleet, we propose multimodal domain adaptation as a preprocessing step that aligns images in appearance and enhances keypoint matching across viewing conditions while preserving the advantages of handcrafted features. Training a generative adversarial network for translation between different illumination and weather conditions, we evaluate qualitative and quantitative aspects of domain adaptation and its impact on feature correspondences. Combined with a multi-feature discriminator, the model is optimized to synthesize images that not only improve feature matching but also exhibit high visual quality. Experiments on a challenging multi-domain dataset recorded in various road scenes over multiple test drives show that our approach outperforms traditional and learning-based methods, improving the completeness or accuracy of structure from motion on multimodal two-domain image collections in eight out of ten test scenes.
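To make the evaluation idea concrete, the following is a minimal sketch (not the paper's actual pipeline) of using a trained GAN generator as a preprocessing step before handcrafted-feature matching: a night image is translated into the day domain, and SIFT matches against a day image are counted with and without adaptation. The file names, the checkpoint `generator_night2day.pt`, and the night-to-day direction are illustrative assumptions; the code assumes OpenCV with SIFT support and a TorchScript-exported generator with inputs/outputs in [-1, 1].

```python
# Sketch: does GAN-based domain adaptation improve cross-condition matching?
# All paths and the generator checkpoint below are hypothetical placeholders.
import cv2
import numpy as np
import torch

def sift_features(img_bgr):
    """Detect SIFT keypoints and descriptors on a grayscale version of the image."""
    gray = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2GRAY)
    sift = cv2.SIFT_create()
    return sift.detectAndCompute(gray, None)

def ratio_test_matches(desc_a, desc_b, ratio=0.8):
    """Brute-force kNN matching with Lowe's ratio test."""
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    good = []
    for pair in matcher.knnMatch(desc_a, desc_b, k=2):
        if len(pair) == 2 and pair[0].distance < ratio * pair[1].distance:
            good.append(pair[0])
    return good

# Images of the same road section under two viewing conditions (placeholders).
img_day = cv2.imread("scene_day.png")
img_night = cv2.imread("scene_night.png")

# Hypothetical trained generator G: night -> day, exported as TorchScript.
G = torch.jit.load("generator_night2day.pt").eval()
with torch.no_grad():
    # BGR -> RGB, HWC -> NCHW, scale to [-1, 1] as assumed by the generator.
    x = torch.from_numpy(img_night[:, :, ::-1].copy()).permute(2, 0, 1).float()
    x = x.unsqueeze(0) / 127.5 - 1.0
    y = G(x).squeeze(0).clamp(-1.0, 1.0)
    # Back to a BGR uint8 image for OpenCV.
    fake_day = ((y + 1.0) * 127.5).permute(1, 2, 0).byte().numpy()[:, :, ::-1]

# Compare cross-condition matching with and without domain adaptation.
_, desc_day = sift_features(img_day)
_, desc_night = sift_features(img_night)
_, desc_fake = sift_features(fake_day)
print("raw night->day matches:    ", len(ratio_test_matches(desc_night, desc_day)))
print("adapted night->day matches:", len(ratio_test_matches(desc_fake, desc_day)))
```

In a full structure-from-motion evaluation, the translated images would only be used to establish correspondences; geometry would still be estimated from the original keypoint locations.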