Abstract

As one of the core technologies for autonomous mobile robots, Visual Simultaneous Localization and Mapping (VSLAM) has been widely researched in recent years. However, most state-of-the-art VSLAM systems adopt a strong scene-rigidity assumption for analytical convenience, which limits their utility in real-world environments containing independently moving objects. Hence, this paper presents a semantic and geometric constraints VSLAM (SGC-VSLAM), which is built on the RGB-D mode of ORB-SLAM2 with the addition of dynamic detection and static point cloud map construction modules. In detail, a novel improved quadtree-based method was adopted in SGC-VSLAM to enhance the performance of the feature extractor in ORB-SLAM (Oriented FAST and Rotated BRIEF-SLAM). Moreover, a new dynamic feature detection method called semantic and geometric constraints was proposed, which provides a robust and fast way to filter out dynamic features. The semantic bounding boxes generated by YOLO v3 (You Only Look Once, v3) were used to compute a more accurate fundamental matrix between adjacent frames, which was then used to filter out all of the truly dynamic features. Finally, a static point cloud map was built by using a new key frame selection strategy for map drawing. Experiments on the public TUM RGB-D (Red-Green-Blue Depth) dataset were conducted to evaluate the proposed approach. The evaluation showed that SGC-VSLAM effectively improves the positioning accuracy of the ORB-SLAM2 system in highly dynamic scenarios and can also build a map containing only the static parts of the real environment, which has long-term application value for autonomous mobile robots.
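The paper's improved quadtree-based feature distribution is not detailed in this excerpt, but the basic quadtree step it builds on (implemented in ORB-SLAM2's extractor as DistributeOctTree) can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation; the function name, the (x, y, response) tuple layout, and the stopping rules are assumptions.

```python
def quadtree_distribute(keypoints, x0, y0, x1, y1, target):
    """Spread keypoints evenly over the image with a quadtree.

    keypoints: list of (x, y, response) tuples from a FAST/ORB detector.
    The most crowded cell is split into four children until there are at
    least `target` non-empty cells; then only the strongest keypoint in
    each cell is kept, yielding a spatially uniform distribution.
    """
    cells = [((x0, y0, x1, y1), list(keypoints))]
    while len(cells) < target:
        cells.sort(key=lambda c: len(c[1]), reverse=True)
        (cx0, cy0, cx1, cy1), kps = cells[0]
        if len(kps) <= 1 or cx1 - cx0 <= 1:
            break                              # most crowded cell cannot split
        mx, my = (cx0 + cx1) / 2, (cy0 + cy1) / 2
        buckets = {0: [], 1: [], 2: [], 3: []}
        for k in kps:                          # assign each point to a child
            buckets[(k[0] >= mx) + 2 * (k[1] >= my)].append(k)
        bounds = [(cx0, cy0, mx, my), (mx, cy0, cx1, my),
                  (cx0, my, mx, cy1), (mx, my, cx1, cy1)]
        cells = [(bounds[i], b) for i, b in buckets.items() if b] + cells[1:]
    # retain the highest-response keypoint per cell
    return [max(kps, key=lambda k: k[2]) for _, kps in cells]
```

ORB-SLAM2 applies an equivalent step at each image-pyramid level so that the retained ORB features cover the whole image rather than clustering on texture-rich regions.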

Highlights

  • To improve the stability and robustness of SLAM systems in dynamic indoor environments, this paper proposes a semantic and geometric constraints VSLAM (SGC-VSLAM), which combines deep learning with geometric constraints to filter out dynamic outliers while simultaneously generating a static point cloud map (a sketch of this filtering step follows this list)

  • The first experiment verified the positioning accuracy and stability of the SLAM system in a dynamic indoor environment, and the second tested the system's ability to build a static map when the scene contains independently moving objects

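The following Python/OpenCV sketch illustrates the semantic and geometric filtering idea summarized above. It is an illustrative reconstruction, not the authors' code: it assumes YOLO v3 has already produced bounding boxes for potentially movable classes (e.g., people), estimates the fundamental matrix only from matches outside those boxes, and then rejects any match whose distance to its epipolar line exceeds a threshold. The function name, the 1-pixel threshold, and the fallback behavior are assumptions.

```python
import cv2
import numpy as np

def filter_dynamic_features(pts_prev, pts_curr, boxes_curr, epi_thresh=1.0):
    """Return a boolean mask that is True for matches judged static.

    pts_prev, pts_curr: (N, 2) arrays of matched keypoints in two
        adjacent frames; boxes_curr: list of (x1, y1, x2, y2) detector
        boxes for movable classes in the current frame.
    """
    pts_prev = np.asarray(pts_prev, dtype=np.float64)
    pts_curr = np.asarray(pts_curr, dtype=np.float64)

    # 1) Keep only matches outside every semantic box: these background
    #    matches yield a fundamental matrix untainted by moving objects.
    def in_any_box(p):
        return any(x1 <= p[0] <= x2 and y1 <= p[1] <= y2
                   for (x1, y1, x2, y2) in boxes_curr)

    bg = np.array([not in_any_box(p) for p in pts_curr])
    if bg.sum() < 8:                       # too few background matches
        return np.ones(len(pts_curr), dtype=bool)
    F, _ = cv2.findFundamentalMat(pts_prev[bg], pts_curr[bg],
                                  cv2.FM_RANSAC, 1.0, 0.99)
    if F is None:
        return np.ones(len(pts_curr), dtype=bool)

    # 2) Check every match against the epipolar constraint x2^T F x1 = 0:
    #    a point far from its epipolar line moves inconsistently with the
    #    camera motion and is marked dynamic.
    ones = np.ones((len(pts_prev), 1))
    lines = (F @ np.hstack([pts_prev, ones]).T).T   # lines a*x + b*y + c = 0
    num = np.abs(np.sum(np.hstack([pts_curr, ones]) * lines, axis=1))
    dist = num / np.hypot(lines[:, 0], lines[:, 1])
    return dist < epi_thresh
```

Features flagged as dynamic would then be excluded from both pose estimation and the static point cloud map, consistent with the pipeline the abstract describes.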

Introduction

Simultaneous Localization and Mapping (SLAM), a prerequisite for many robotic applications, involves a system that simultaneously localizes the mobile robot and builds a map of the surrounding environment without any prior environmental information [1,2]. Visual SLAM (VSLAM), where the primary sensor is a camera, has received increasing attention in recent years and can be divided into two categories: feature-based methods [3,4,5] and direct methods [6,7]. Feature-based methods extract features from each frame to estimate the camera pose and offer better environmental adaptability, whereas direct methods estimate the pose by minimizing photometric error, which makes them more sensitive to lighting changes than feature-based methods. Some problems in VSLAM remain inadequately solved [8,9].
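To make the distinction between the two categories concrete, the toy sketch below (not taken from the paper) contrasts the two objectives: feature-based methods minimize the reprojection error of matched landmarks, while direct methods minimize the photometric error of raw pixel intensities. All names and inputs are illustrative assumptions; a calibrated camera matrix K and a known pose (R, t) are assumed given.

```python
import numpy as np

def reprojection_error(K, R, t, X, u):
    """Feature-based objective: distance between an observed keypoint u
    and the projection of its 3D landmark X into the current frame."""
    x = K @ (R @ X + t)                  # project landmark into the image
    return np.linalg.norm(u - x[:2] / x[2])

def photometric_error(I_ref, I_cur, u_ref, u_cur):
    """Direct-method objective: intensity difference between the same
    scene point observed in two images; no feature matching is involved,
    which is why direct methods are more sensitive to lighting changes."""
    return float(I_ref[u_ref[1], u_ref[0]]) - float(I_cur[u_cur[1], u_cur[0]])
```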
