Abstract

As a typical application-oriented solution to autonomous robot navigation, visual simultaneous localization and mapping (SLAM) is essentially restricted to a simplex understanding of the environment based on the geometric features of images. By contrast, semantic SLAM, which is characterized by high-level environmental perception, has opened the door to applying image semantics to efficiently estimate poses, detect loop closures, build 3D maps, and so on. This article presents a detailed review of recent advances in semantic SLAM, mainly covering how they address perception, robustness, and accuracy. Specifically, the concept of a “semantic extractor” and the framework of “modern visual SLAM” are presented first. After stating the challenges associated with perception, robustness, and accuracy, we discuss some open problems from a macroscopic view and attempt to find answers. We argue that multiscaled map representation, object SLAM systems, and deep neural network-based SLAM pipeline design could be effective solutions to image semantics-fused visual SLAM.

Highlights

  • Autonomous robots are capable of performing specific tasks independently without any human intervention

  • For autonomous robot navigation tasks, semantic SLAM, which aims to better understand and perceive information from the robot's work volume, has drawn increasing attention

  • We review the development of semantic SLAM with respect to perception, robustness, and accuracy, and discuss the open problems associated with recent progress and challenges

Introduction

Autonomous robots are capable of performing specific tasks independently without any human intervention. Research[79,80,81,82,83,84] has attempted to construct superior pixel-level semantic maps by applying traditional tools such as SVM and CRF, since these tools are considered useful for object identification and scene segmentation (although SVM is more commonly used to address industrial problems of prediction,[85,86,87] classification,[88] or fault diagnosis[89]). Inspired by advances in deep learning, more research has been devoted to CNN-based object identification, detection, and segmentation.[90,91,92] These achievements in turn provide a foundation for constructing more accurate pixel-level semantic maps.[93]
