Abstract
Visual Simultaneous Localization and Mapping (VSLAM) has developed into a basic capability of robots over the past few decades, and many impressive open-source SLAM systems are available. However, most current SLAM theories and approaches rely on the static-scene assumption, which rarely holds in practice because moving objects are ubiquitous and unavoidable in most environments. This paper proposes DDL-SLAM (Dynamic Deep Learning SLAM), a robust RGB-D SLAM system for dynamic scenarios that builds on ORB-SLAM2 and adds dynamic object segmentation and background inpainting. Moving objects are detected using both semantic segmentation and multi-view geometry. Having a static scene map allows the background occluded by moving objects in a frame to be inpainted, and localization accuracy in dynamic environments is greatly improved as a result. Experiments on a public RGB-D benchmark dataset show that DDL-SLAM significantly enhances the robustness and stability of the RGB-D SLAM system in highly dynamic environments.
Highlights
Simultaneous Localization and Mapping (SLAM) is a precondition for some robot applications, such as industrial automation, autonomous vehicles, and collision-less navigation.
Some RGB-D SLAM systems in the literature deal with moving targets in challenging dynamic scenes [28]–[32]. Our goal is to enhance the robustness and stability of RGB-D SLAM based on ORB-SLAM2 [33] in highly dynamic scenarios.
Experimental results: the DDL-SLAM system has been evaluated on the public TUM RGB-D dataset.
Summary
Simultaneous Localization and Mapping (SLAM) is a precondition for some robot applications, such as industrial automation, autonomous vehicles, and collision-less navigation. SLAM was first put forward by Smith et al. [1], [2] in 1986. An autonomous robot estimates its pose from the data acquired by its various sensors and from information about its previous locations as it travels through an unknown scene, while incrementally building a consistent map of that scene. Visual SLAM, in which a camera is the only exteroceptive sensor, has been extensively investigated in recent years. It uses images as the sole source of information about the external environment [4], because images contain a large amount of useful information and can also serve other visual applications, such as semantic segmentation, object detection, and tracking.