Abstract

Most existing vision-based SLAM (simultaneous localization and mapping) systems assume a static environment, which greatly limits their use in real-world settings. In a dynamic environment, the quality of the global point cloud map constructed by a SLAM system depends on both the camera pose estimation and the removal of noise blocks from the local point cloud maps. Most dynamic SLAM systems focus on improving the accuracy of camera localization, but rarely address noise block removal. In this paper, we propose a novel semantic SLAM system that builds a more accurate point cloud map in dynamic environments. We obtain the masks and bounding boxes of the dynamic objects in the images with BlitzNet. The mask of a dynamic object is extended by analyzing the depth statistics of the mask within its bounding box. The islands generated by the residual information of dynamic objects are removed by a morphological operation after geometric segmentation. With the bounding boxes, the images can be quickly divided into environment regions and dynamic regions, and the depth-stable matching points in the environment regions are used to construct epipolar constraints that locate the static matching points in the dynamic regions. To verify the performance of the proposed SLAM system, we conduct experiments on the TUM RGB-D datasets. Compared with state-of-the-art dynamic SLAM systems, our system constructs the best global point cloud map.

Highlights

  • Simultaneous localization and mapping (SLAM) plays an important role in the field of autonomous robots and unmanned vehicles [1]

  • Halfsphere means that the camera followed a halfsphere-like trajectory; rpy means that the camera rotated about the roll-pitch-yaw axes; static means that the camera was manually held roughly in place; xyz means that the camera moved along the x-y-z axes

  • The bounding boxes and masks of the potential dynamic objects could be obtained with BlitzNet, and the image can be quickly divided into environment regions and dynamic regions by the bounding boxes

Summary

INTRODUCTION

Simultaneous localization and mapping (SLAM) plays an important role in the field of autonomous robots and unmanned vehicles [1]. Since the existing semantic segmentation algorithms are not perfect [35], some information about the dynamic objects leaks into the environment, and this information is retained in the constructed local point cloud maps, where it forms a large number of noise blocks. If a dynamic object is far away from the camera, or the segmentation algorithm performs poorly on a certain type of object, the area of the obtained mask is very small, so the depth values in the mask are not enough to represent the depth information of the dynamic object effectively. In this case, we remove the information in the whole bounding box of the dynamic object to eliminate its impact on the local point cloud map, as follows: if $SD_{mask}^{O(i)} < \tau_1$, then $SD_{mask}^{O(i)} = SD_{BBOX}(i)$. The dynamic object mask extension algorithm is shown in Algorithm 1.
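
The bounding-box fallback rule described above can be illustrated with a minimal sketch. It is not the authors' Algorithm 1. It assumes that the quantity compared with τ1 is the number of mask pixels with a valid (positive) depth reading, which this summary does not define. The names `count_valid_depth`, `apply_bbox_fallback`, and the `(x_min, y_min, x_max, y_max)` box layout are hypothetical choices made for illustration.

```python
from typing import List, Tuple

Mask = List[List[bool]]
Depth = List[List[float]]
BBox = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max), max exclusive (assumed layout)


def count_valid_depth(mask: Mask, depth: Depth) -> int:
    """Count mask pixels whose depth reading is valid (assumed: depth > 0)."""
    return sum(
        1
        for mask_row, depth_row in zip(mask, depth)
        for in_mask, d in zip(mask_row, depth_row)
        if in_mask and d > 0
    )


def apply_bbox_fallback(mask: Mask, depth: Depth, bbox: BBox, tau1: int) -> Mask:
    """Keep the mask if it carries enough depth samples; otherwise
    replace it with the whole bounding box of the dynamic object."""
    if count_valid_depth(mask, depth) >= tau1:
        return [row[:] for row in mask]  # enough samples: return a copy of the mask
    x0, y0, x1, y1 = bbox
    # Too few samples: mark every pixel inside the bounding box as dynamic.
    return [
        [y0 <= y < y1 and x0 <= x < x1 for x in range(len(row))]
        for y, row in enumerate(mask)
    ]
```

With a one-pixel mask and `tau1 = 2`, the function returns a mask covering the whole box, so all depth values inside the box would be excluded from the local point cloud map. With `tau1 = 1` the original mask is kept.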

INTERACTION JUDGMENT BETWEEN POTENTIAL DYNAMIC OBJECT AND DYNAMIC OBJECT
GEOMETRIC SEGMENTATION OF DEPTH IMAGE
FEATURE POINTS WITH STABLE DEPTH VALUES
LOCATION OF STATIC MATCHING POINTS
EXPERIMENTAL RESULTS
CONCLUSION