Abstract

Semantic information usually contains a description of the environment's content, which enables a mobile robot to understand the environment and improves its ability to interact with it. In high-level human–computer interaction applications, a Simultaneous Localization and Mapping (SLAM) system not only needs higher accuracy and robustness, but also must be able to construct a static semantic map of the environment. However, traditional visual SLAM lacks semantic information. Furthermore, in actual scenes, dynamic objects reduce system performance and introduce redundancy into the constructed map, both of which directly affect the robot's ability to perceive and understand its surroundings. Based on ORB-SLAM3, this article proposes a new algorithm that uses semantic information and global dense optical flow as constraints to generate a dynamic-static mask and eliminate dynamic objects. Then, to further construct a static 3D semantic map of indoor dynamic environments, 2D semantic information is fused with the 3D point cloud. Experimental results on different types of dataset sequences show that, compared with the original ORB-SLAM3, both the Absolute Pose Error (APE) and the Relative Pose Error (RPE) are improved to varying degrees; on freiburg3-walking-xyz in particular, the APE is reduced by 97.78% from the original average value of 0.523, and the RPE is reduced by 52.33% from the original average value of 0.0193. Compared with DS-SLAM and DynaSLAM, our system improves real-time performance while ensuring accuracy and robustness. Meanwhile, the expected map with environmental semantic information is built, and the map redundancy caused by dynamic objects is successfully reduced. Test results in real scenes further demonstrate the construction of static semantic maps and prove the effectiveness of our algorithm.
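As a minimal sketch of the dynamic-static masking idea described in the abstract (not the exact constraints used in the paper), the Python snippet below combines a per-pixel semantic mask from a segmentation network with Farnebäck dense optical flow to flag dynamic pixels; the dynamic class IDs, flow threshold, and fusion rule are illustrative assumptions.

```python
import cv2
import numpy as np

# Illustrative class IDs assumed to be "potentially dynamic" (e.g., person);
# the actual class list depends on the segmentation model used.
DYNAMIC_CLASS_IDS = {15}
FLOW_MAG_THRESHOLD = 2.0  # illustrative threshold, in pixels per frame


def dynamic_static_mask(prev_gray, curr_gray, semantic_labels):
    """Return a boolean mask that is True for pixels treated as dynamic.

    prev_gray, curr_gray : uint8 grayscale frames at t-1 and t
    semantic_labels      : int array of per-pixel class IDs for the current frame
    """
    # Global dense optical flow (Farneback); parameters are common defaults.
    flow = cv2.calcOpticalFlowFarneback(prev_gray, curr_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    flow_mag = np.linalg.norm(flow, axis=2)

    # Semantic prior: pixels belonging to movable classes.
    semantic_dynamic = np.isin(semantic_labels, list(DYNAMIC_CLASS_IDS))

    # Motion evidence: pixels whose flow magnitude exceeds the threshold.
    # (A real system would first compensate for camera ego-motion.)
    motion_dynamic = flow_mag > FLOW_MAG_THRESHOLD

    # Simple fusion rule: a pixel is dynamic if it is a movable class AND moving.
    return semantic_dynamic & motion_dynamic
```

Feature points and depth measurements falling inside such a mask would then be discarded before tracking and before point-cloud fusion, which is the general intent of the dynamic-object elimination described above.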

Highlights

  • SLAM (Simultaneous Localization and Mapping) is a method for intelligent mobile devices to estimate their pose and build a map of the surrounding environment in unknown scenes

  • To improve the performance of ORB-SLAM3 in dynamic scenes, we propose a new method that increases the accuracy and robustness of visual odometry in such scenes and, to provide environmental semantic information, further constructs an indoor static semantic map

  • The analyses of the experimental results show that our algorithm effectively mitigates the degradation of ORB-SLAM3's tracking performance caused by the movement of dynamic objects

Introduction

SLAM (Simultaneous Localization and Mapping) is a method for intelligent mobile devices to estimate their pose and build a map of the surrounding environment in unknown scenes. It is widely used in many fields, such as unmanned driving, robotics, and AR (Augmented Reality). A typical SLAM framework is mainly composed of front-end odometry, back-end pose optimization, and loop detection. According to the type of sensor used to obtain environmental data, SLAM systems can be roughly divided into Laser SLAM and Visual SLAM. The sensors used by Visual SLAM are low in price and capture environmental content more intuitively.
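As a rough sketch of the framework named above (front-end odometry, back-end pose optimization, and loop detection), the outline below shows how these components typically fit together; all class and function names here are hypothetical placeholders, not components of the system described in the article.

```python
# Hypothetical outline of a typical SLAM loop; FrontEnd, BackEnd, LoopDetector,
# and Mapper are illustrative placeholders.

def slam_loop(frames, front_end, back_end, loop_detector, mapper):
    for frame in frames:
        # Front-end odometry: track features and estimate the current pose.
        pose, keyframe = front_end.track(frame)

        if keyframe is not None:
            # Back-end: refine poses and landmarks (e.g., local bundle adjustment).
            back_end.optimize(keyframe)

            # Loop detection: correct accumulated drift when a place is revisited.
            if loop_detector.detect(keyframe):
                back_end.correct_loop(keyframe)

            # Mapping: fuse the keyframe's observations into the global map.
            mapper.insert(keyframe, pose)
```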
