Abstract
Simultaneous localization and mapping (SLAM), which addresses the joint estimation of self-localization and scene mapping, has been widely used in many applications such as mobile robots, drones, and augmented reality (AR). However, state-of-the-art SLAM approaches are typically designed under a static-world assumption and tend to degrade when moving objects appear in the scene. This article presents a novel semantic visual-inertial SLAM system for dynamic environments that builds on VINS-Mono and performs real-time trajectory estimation using pixel-wise semantic segmentation results. We integrate a feature extraction and tracking framework into the front end of the SLAM system that makes full use of the time spent waiting for the semantic segmentation module to finish, so that feature points on subsequent camera images are tracked continuously. In this way, the system can track feature points stably even under high-speed motion. We also construct a dynamic feature detection module that combines pixel-wise semantic segmentation results with multi-view geometric constraints to exclude dynamic feature points. We evaluate our system on public datasets covering dynamic indoor and outdoor scenes. Several experiments demonstrate that our system achieves higher localization accuracy and robustness than state-of-the-art SLAM systems in challenging environments.
Highlights
Once the camera moves rapidly, the pose estimation is likely to drift because feature-point tracking fails. To address these problems, we propose a real-time SLAM system for dynamic environments that combines semantic segmentation with multi-view geometric constraints and can effectively identify and avoid using feature points located on dynamic objects
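The combination of semantic and geometric cues in the dynamic feature detection module could be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the fixed pixel threshold, and the boolean dynamic-class mask are assumptions. A matched feature is discarded if its pixel falls on a segmented dynamic object, or if it violates the epipolar constraint between the two views.

```python
import numpy as np

def epipolar_distance(p1, p2, F):
    """Distance (pixels) from point p2 to the epipolar line of p1
    induced by the fundamental matrix F."""
    p1h = np.array([p1[0], p1[1], 1.0])
    p2h = np.array([p2[0], p2[1], 1.0])
    line = F @ p1h                                   # line in image 2: ax + by + c = 0
    return abs(p2h @ line) / np.hypot(line[0], line[1])

def filter_dynamic_features(pts1, pts2, F, dyn_mask, thresh_px=1.0):
    """Keep only matches that are neither on a pixel labeled as a
    dynamic class (semantic cue) nor inconsistent with the epipolar
    geometry (multi-view cue).  dyn_mask is a boolean H x W image."""
    kept = []
    for p1, p2 in zip(pts1, pts2):
        u, v = int(round(p2[0])), int(round(p2[1]))
        on_dynamic_object = bool(dyn_mask[v, u])
        geometric_outlier = epipolar_distance(p1, p2, F) > thresh_px
        if not (on_dynamic_object or geometric_outlier):
            kept.append((p1, p2))
    return kept
```

In a real pipeline F would come from the estimated relative camera pose (or a RANSAC fit on static features), and the threshold would be tuned to the feature-tracking noise; both are placeholders here.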
We propose a novel feature-point tracking method that uses multi-threading and a mutex so that the feature tracking module tracks feature points on every image from the camera, preventing the frame loss that the time-consuming semantic segmentation network would otherwise cause
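The threading idea above could be sketched like this. It is a simplified, hypothetical structure (class and function names are ours, not the authors' API): a fast tracking thread consumes every camera frame from its own queue while a slower segmentation thread runs in parallel, and a mutex protects the shared store of segmentation masks, so no frame is dropped while waiting for the network.

```python
import queue
import threading

class AsyncFrontEnd:
    """Illustrative two-thread front end: tracking never waits for
    segmentation, and a mutex guards the shared mask store."""

    def __init__(self, segment_fn, track_fn):
        self.track_queue = queue.Queue()     # every frame, for tracking
        self.seg_queue = queue.Queue()       # every frame, for segmentation
        self.masks = {}                      # frame_id -> semantic mask
        self.mask_lock = threading.Lock()    # mutex protecting self.masks
        self.segment_fn = segment_fn
        self.track_fn = track_fn
        self.tracked = []                    # frame ids processed by the tracker

    def push_frame(self, frame_id, image):
        self.track_queue.put((frame_id, image))
        self.seg_queue.put((frame_id, image))

    def _tracker(self):
        while (item := self.track_queue.get()) is not None:
            frame_id, image = item
            self.track_fn(frame_id, image)   # fast, e.g. KLT optical flow
            self.tracked.append(frame_id)

    def _segmenter(self):
        while (item := self.seg_queue.get()) is not None:
            frame_id, image = item
            mask = self.segment_fn(image)    # slow, e.g. CNN inference
            with self.mask_lock:
                self.masks[frame_id] = mask

    def run(self, frames):
        t = threading.Thread(target=self._tracker)
        s = threading.Thread(target=self._segmenter)
        t.start(); s.start()
        for frame_id, image in frames:
            self.push_frame(frame_id, image)
        self.track_queue.put(None)           # sentinel: shut both threads down
        self.seg_queue.put(None)
        t.join(); s.join()
```

Even when `segment_fn` is much slower than the camera rate, the tracker thread still sees every frame in order, which is the property the highlight describes.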
Summary
In the past thirty years, with the rapid development of computer science and sensors, simultaneous localization and mapping (SLAM) has become an indispensable technology in many fields, such as robotics [1], [2], autonomous driving [3], [4], and augmented reality (AR) [5], [6]. Benefiting from advances in computer vision, visual SLAM (V-SLAM) has attracted the attention of many researchers and companies with its low cost, low power consumption, and ability to provide rich scene information, and it has become a research hotspot in the SLAM field. Although current state-of-the-art V-SLAM algorithms work well in static environments, they are prone to failure when confronted with dynamic scenes. Real-world environments, such as shopping malls, streets, and stations, usually contain various moving objects. To improve the localization accuracy and robustness of a SLAM system in dynamic environments, it is crucial to effectively avoid the interference of dynamic objects.