Abstract

In this paper, we propose a semantic simultaneous localization and mapping (SLAM) framework for rescue robots and report its use in navigation tasks. Our framework generates not only geometric maps in the form of dense point-clouds but also corresponding point-wise semantic labels produced by a semantic segmentation convolutional neural network (CNN). The semantic segmentation CNN is trained on our RGB-D dataset of the RoboCup Rescue-Robot-League (RRL) competition environment. With the help of semantic information, the rescue robot can identify different types of terrain in a complex environment, allowing it to avoid specific obstacles or to choose routes with better traversability. To reduce segmentation noise, our approach uses depth images to filter the segmentation results of each frame. The overall semantic map is then further refined at the level of point-cloud voxels: by accumulating the results of multiple frames in each voxel, we obtain semantic maps with consistent semantic labels. To show the advantage of having a semantic map of the environment, we report a case study of how the semantic map can be used in a navigation task to reduce the arrival time while ensuring safety. The experimental results show that our semantic SLAM framework is capable of generating a dense semantic map of the complex RRL competition environment, with which the arrival time of the navigation task is effectively reduced.

Highlights

  • Semantic information representing classes of objects allows robots to understand their surroundings at a higher level than geometry or appearance

  • We propose a semantic simultaneous localization and mapping (SLAM) framework for rescue robots to better navigate through challenging environments comprising complex terrains beyond flat ground

  • We utilize depth information to determine whether neighboring pixels in the semantic image belong to the same object, improving the precision of semantic segmentation (a sketch of this filtering step follows this list)
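The paper does not spell out the exact filter, so the following is only a minimal Python sketch of one way to use depth continuity to clean per-frame segmentation: each pixel is relabeled by a majority vote among neighbors whose depth is close enough to plausibly lie on the same surface. The function name `depth_filter_labels` and the 5 cm depth threshold are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def depth_filter_labels(labels, depth, depth_thresh=0.05):
    """Smooth per-frame segmentation labels using depth continuity.

    Hypothetical sketch: neighbors whose depth differs from the
    center pixel by less than `depth_thresh` (meters) are assumed
    to lie on the same object surface; the pixel then takes the
    majority label among those depth-consistent neighbors.
    `labels` is an (H, W) int array, `depth` an (H, W) float array.
    """
    labels = np.asarray(labels)
    depth = np.asarray(depth)
    h, w = labels.shape
    out = labels.copy()
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            d = depth[y, x]
            votes = {}
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    ny, nx = y + dy, x + dx
                    # Only neighbors on the same surface (similar depth) vote.
                    if abs(depth[ny, nx] - d) < depth_thresh:
                        lbl = int(labels[ny, nx])
                        votes[lbl] = votes.get(lbl, 0) + 1
            if votes:
                out[y, x] = max(votes, key=votes.get)
    return out
```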


Summary

INTRODUCTION

Semantic information representing classes of objects allows robots to understand their surroundings at a higher level than geometry or appearance. With the help of semantic information, robots can perform better in tasks such as path planning and human-robot interaction. In the RoboCup Rescue-Robot-League (RRL) competition – an international competition for evaluating the performance of rescue robots – the contestants need to autonomously traverse and generate maps of a maze consisting of challenging terrains, such as stairs, stepfields, elevated slopes, and steep ramps. We propose a semantic simultaneous localization and mapping (SLAM) framework for rescue robots. Our framework combines the well-known ORB-SLAM2 [1] method with a convolutional neural network (CNN) to generate both geometric and semantic maps as dense point-clouds, using an RGB-D camera. We accumulate the local point-clouds of multiple frames into voxels and determine the most frequently observed semantic label for each voxel. Apart from common RGB semantic segmentation algorithms, there are also RGB-D-based [9]–[13] and point-cloud-based [14] approaches.
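To make the voxel-fusion step concrete, here is a minimal Python sketch of per-voxel label accumulation with a majority vote, as described above. The class name `SemanticVoxelMap`, the 5 cm voxel size, and the dictionary-of-histograms layout are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from collections import defaultdict, Counter

class SemanticVoxelMap:
    """Hypothetical sketch: accumulate labeled points from multiple
    frames into voxels. Each voxel keeps a histogram of the semantic
    labels observed inside it; the map reports the most frequent
    label per voxel, so labels stay consistent across frames.
    """

    def __init__(self, voxel_size=0.05):
        self.voxel_size = voxel_size            # voxel edge length (m), assumed
        self.histograms = defaultdict(Counter)  # voxel index -> label counts

    def insert(self, points, labels):
        """points: (N, 3) world-frame coordinates; labels: (N,) ints."""
        idx = np.floor(np.asarray(points) / self.voxel_size).astype(int)
        for key, lbl in zip(map(tuple, idx), labels):
            self.histograms[key][int(lbl)] += 1

    def fused_labels(self):
        """Return {voxel index: majority semantic label}."""
        return {k: c.most_common(1)[0][0] for k, c in self.histograms.items()}
```

Keeping a full label histogram per voxel, rather than only the current best label, lets later frames overturn early misclassifications, which matches the stated goal of obtaining semantic maps with consistent labels.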

SIMULTANEOUS LOCALIZATION AND MAPPING
SEMANTIC SEGMENTATION
CONCLUSION