Abstract
Monocular visual SLAM methods can accurately track the camera pose and infer the scene structure by building sparse correspondences between two or more views of the scene. However, the 3D maps reconstructed by these methods are extremely sparse. On the other hand, deep learning is widely used to predict dense depth maps from single-view color images, but the results suffer from blurry depth boundaries, which severely distort the 3D scene structure. This paper therefore proposes a dense reconstruction method under the monocular SLAM framework (DRM-SLAM), in which a novel scene depth fusion scheme is designed to fully exploit both the sparse depth samples from monocular SLAM and the dense depth maps predicted by a convolutional neural network (CNN). Within this scheme, a CNN architecture is carefully designed for robust depth estimation. Our approach also accounts for the scale ambiguity inherent in monocular SLAM. Extensive experiments on benchmark datasets and on our own captured dataset demonstrate the accuracy and robustness of the proposed DRM-SLAM. Evaluations of runtime and of adaptability under challenging environments further verify the practicality of our method.
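To make the fusion idea concrete: monocular SLAM recovers depth only up to an unknown global scale, so before sparse SLAM samples and a CNN depth prediction can be combined, one typically estimates a scale factor that best aligns the two (e.g., in the least-squares sense) and then constrains the dense map by the metric samples. The sketch below is a minimal, hypothetical illustration of that alignment step only, not the paper's actual fusion scheme; the function names (`align_scale`, `fuse_depth`), the boolean sample mask, and the simple "overwrite at sample locations" fusion are assumptions made for illustration.

```python
import numpy as np

def align_scale(cnn_depth, sparse_depth, mask):
    """Estimate the global scale s minimizing ||s * d_cnn - d_slam||^2
    over the pixels where SLAM provides sparse depth samples."""
    d_cnn = cnn_depth[mask]
    d_slam = sparse_depth[mask]
    # Closed-form least-squares solution for a single scale factor.
    return float(np.dot(d_cnn, d_slam) / np.dot(d_cnn, d_cnn))

def fuse_depth(cnn_depth, sparse_depth, mask):
    """Naively fuse a scale-aligned CNN depth map with SLAM samples:
    rescale the dense prediction, then trust SLAM where it observed depth."""
    s = align_scale(cnn_depth, sparse_depth, mask)
    fused = s * cnn_depth
    fused[mask] = sparse_depth[mask]
    return fused

# Toy usage: a 480x640 dense prediction and ~200 sparse SLAM samples.
rng = np.random.default_rng(0)
cnn_depth = rng.uniform(0.5, 5.0, (480, 640))
mask = rng.random((480, 640)) < 0.0007          # sparse sample locations
sparse_depth = np.zeros((480, 640))
sparse_depth[mask] = 2.0 * cnn_depth[mask]      # SLAM depths at another scale
print(align_scale(cnn_depth, sparse_depth, mask))  # recovers ~2.0
```

A real fusion scheme, such as the one proposed in the paper, would propagate the sparse constraints smoothly into neighboring pixels rather than overwriting isolated values, but the scale-alignment step shown here is the standard remedy for monocular scale ambiguity.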