Abstract

Simultaneous localization and mapping (SLAM) is one of the most essential technologies for mobile robots. Although the field has made great progress in recent years, a number of challenges remain for SLAM in dynamic environments and high-level semantic scenes. In this paper, we propose a novel multimodal semantic SLAM system (MISD-SLAM), which removes dynamic objects from the environment and reconstructs the static background with semantic information. MISD-SLAM comprises three main processes: instance segmentation, dynamic pixel removal, and semantic 3D map construction. An instance segmentation network provides semantic knowledge of the surrounding environment at the instance level. The ORB features located on predefined dynamic objects are removed directly. In this way, MISD-SLAM reduces the impact of dynamic objects and provides more precise pose estimation. Then, by combining a multiview geometry constraint with the K-means clustering algorithm, our system removes pixels that belong to no predefined dynamic class but are nevertheless moving. Meanwhile, a dense 3D point cloud map with semantic information is reconstructed, which recovers the static background without corruption by dynamic objects. Finally, we evaluate MISD-SLAM by comparing it with ORB-SLAM3 and state-of-the-art dynamic SLAM systems on the TUM RGB-D dataset and in real-world dynamic indoor environments. The results indicate that our method significantly improves localization accuracy and system robustness, especially in highly dynamic environments.
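The direct removal of features on predefined dynamic objects can be illustrated with a minimal sketch. It is not the MISD-SLAM implementation: the function name `filter_static_keypoints`, the mask format (one boolean image per instance), and the class list `DYNAMIC_CLASSES` are assumptions made for illustration, since the abstract does not specify them. Keypoints are taken as plain (x, y) pixel coordinates rather than as output of a particular ORB extractor.

```python
import numpy as np

# Hypothetical set of classes treated as dynamic; the abstract does not list them.
DYNAMIC_CLASSES = {"person"}


def filter_static_keypoints(keypoints, instance_masks, instance_labels,
                            image_shape, dynamic_classes=DYNAMIC_CLASSES):
    """Drop keypoints that lie on an instance of a predefined dynamic class.

    keypoints:       (N, 2) array-like of (x, y) pixel coordinates.
    instance_masks:  sequence of (H, W) boolean arrays, one per instance.
    instance_labels: sequence of class names, aligned with instance_masks.
    image_shape:     (H, W) of the image the masks refer to.
    """
    keypoints = np.asarray(keypoints, dtype=float).reshape(-1, 2)
    height, width = image_shape

    # Merge the masks of all dynamic instances into a single boolean image.
    dynamic = np.zeros((height, width), dtype=bool)
    for mask, label in zip(instance_masks, instance_labels):
        if label in dynamic_classes:
            dynamic |= np.asarray(mask, dtype=bool)

    # Round to the nearest pixel and clamp to the image bounds.
    cols = np.clip(np.rint(keypoints[:, 0]).astype(int), 0, width - 1)
    rows = np.clip(np.rint(keypoints[:, 1]).astype(int), 0, height - 1)

    # Keep only keypoints whose pixel is not covered by a dynamic instance.
    keep = ~dynamic[rows, cols]
    return keypoints[keep]
```

Only the surviving keypoints would then be passed to pose estimation. Moving pixels outside the predefined classes are not handled by this step; the abstract assigns them to the later multiview geometry and K-means stage.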
