Abstract
As one of the core technologies for autonomous mobile robots, Visual Simultaneous Localization and Mapping (VSLAM) has been widely researched in recent years. However, most state-of-the-art VSLAM systems adopt a strong scene-rigidity assumption for analytical convenience, which limits their utility in real-world environments containing independently moving objects. Hence, this paper presents a semantic- and geometric-constraint VSLAM (SGC-VSLAM), built on the RGB-D mode of ORB-SLAM2 with the addition of dynamic detection and static point cloud map construction modules. In detail, a novel improved quadtree-based method was adopted in SGC-VSLAM to enhance the performance of the ORB (Oriented FAST and Rotated BRIEF) feature extractor. Moreover, a new dynamic feature detection method, termed semantic and geometric constraints, was proposed, which provides a robust and fast way to filter dynamic features. The semantic bounding boxes generated by YOLO v3 (You Only Look Once, v3) were used to compute a more accurate fundamental matrix between adjacent frames, which was then used to filter out all of the truly dynamic features. Finally, a static point cloud was estimated using a new drawing-key-frame selection strategy. Experiments on the public TUM RGB-D (Red-Green-Blue Depth) dataset were conducted to evaluate the proposed approach. The evaluation revealed that SGC-VSLAM effectively improves the positioning accuracy of the ORB-SLAM2 system in highly dynamic scenarios and can also build a map of the static parts of the real environment, which has long-term application value for autonomous mobile robots.
Highlights
Simultaneous Localization and Mapping (SLAM), a prerequisite for many robotic applications, involves a system that simultaneously completes the positioning of the mobile robot itself and the map construction of the surrounding environment without any prior environmental information [1,2]. Visual SLAM (VSLAM), where the primary sensor is a camera, has received increasing attention in recent years and can be classified into two approaches: feature-based methods [3,4,5] and direct methods [6,7].
To improve the stability and robustness of the SLAM system in dynamic indoor environments, this paper proposes a semantically and geometrically constrained Visual Simultaneous Localization and Mapping algorithm (SGC-VSLAM), which adopts a novel combination of deep learning and geometry to filter out outliers and, at the same time, generate a static point cloud map.
The purpose of the first test was to verify the positioning accuracy and stability of the SLAM system in a dynamic indoor environment, and the second experiment tested the capability of static map creation when the scene contains independent dynamic objects.
Summary
Simultaneous Localization and Mapping (SLAM), a prerequisite for many robotic applications, involves a system that simultaneously completes the positioning of the mobile robot itself and the map construction of the surrounding environment without any prior environmental information [1,2]. Visual SLAM (VSLAM), where the primary sensor is a camera, has received increasing attention in recent years and can be classified into two approaches: feature-based methods [3,4,5] and direct methods [6,7]. Feature-based methods extract features in each frame to estimate the self-pose and have better environmental adaptability, whereas direct methods estimate the pose by minimizing photometric errors and are more sensitive to lighting changes. Some problems in VSLAM remain inadequately solved to this day [8,9].
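The geometric constraint described in the abstract relies on the epipolar relation: a static feature matched across two frames must lie close to the epipolar line induced by the fundamental matrix, while an independently moving feature typically violates it. The following minimal numpy sketch illustrates that check; the function names, the example fundamental matrix, and the pixel threshold are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def epipolar_distance(F, pts1, pts2):
    """Distance (in pixels) from each point in pts2 to the epipolar line
    l = F @ p1 induced by its matched point in pts1."""
    ones = np.ones((pts1.shape[0], 1))
    p1 = np.hstack([pts1, ones])          # N x 3 homogeneous points, frame 1
    p2 = np.hstack([pts2, ones])          # N x 3 homogeneous points, frame 2
    lines = p1 @ F.T                      # row i is the epipolar line F @ p1[i]
    num = np.abs(np.sum(lines * p2, axis=1))
    den = np.sqrt(lines[:, 0] ** 2 + lines[:, 1] ** 2)
    return num / den

def flag_dynamic(F, pts1, pts2, thresh=1.0):
    """Mark matches whose epipolar distance exceeds a pixel threshold
    as candidate dynamic features (threshold value is an assumption)."""
    return epipolar_distance(F, pts1, pts2) > thresh

# Toy example: F for a pure horizontal camera translation, so static
# points keep the same image row between frames.
F = np.array([[0., 0., 0.],
              [0., 0., -1.],
              [0., 1., 0.]])
pts1 = np.array([[10., 20.], [30., 40.]])
pts2 = np.array([[15., 20.],            # same row: consistent, static
                 [35., 55.]])           # row jumped: epipolar violation
print(flag_dynamic(F, pts1, pts2))      # → [False  True]
```

In the pipeline described here, matches inside YOLO v3 bounding boxes would be excluded before estimating F (e.g., with a RANSAC-based solver), so that potentially dynamic regions do not corrupt the fundamental matrix used for this test.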