Abstract

Visual simultaneous localization and mapping (SLAM) is challenging in dynamic environments because moving objects can impair camera pose tracking and mapping. This paper introduces a method for robust dense object-level SLAM in dynamic environments that takes a live stream of RGB-D frames as input, detects moving objects, and segments the scene into individual objects while simultaneously tracking and reconstructing their 3D structures. The approach provides a new method of dynamic object detection that integrates prior knowledge from a constructed object model database, object-oriented 3D tracking against the camera pose, and the association between instance segmentation results on the current frame and the object database to identify dynamic objects in the current frame. By leveraging the 3D static model for frame-to-model alignment and culling dynamic objects, camera motion estimation reduces overall drift. Based on the estimated camera poses and instance segmentation results, an object-level semantic map of the world is constructed. Experimental results on the TUM RGB-D dataset, comparing the proposed method to related state-of-the-art approaches, demonstrate that our method achieves similar performance in static scenes and improved accuracy and robustness in dynamic scenes.
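The dynamic-object detection step described above can be illustrated with a minimal sketch: an object is flagged as dynamic when its observed 3D position (from instance segmentation on the current frame) deviates from where the static object model predicts it should appear under the current camera pose. All function names, data structures, and the threshold below are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def detect_dynamic_objects(model_centroids, frame_centroids, cam_pose, thresh=0.05):
    """Flag objects as dynamic by comparing observed vs. predicted 3D centroids.

    model_centroids: {object_id: 3D centroid in world coordinates (from the object database)}
    frame_centroids: {object_id: 3D centroid observed in the current camera frame}
    cam_pose: (R, t) world-to-camera rotation and translation
    thresh: displacement (in meters) above which an object is considered moving
    """
    R, t = cam_pose
    dynamic = {}
    for obj_id, p_world in model_centroids.items():
        if obj_id not in frame_centroids:
            continue  # object not associated with any segment in this frame
        p_pred = R @ p_world + t            # where a static object should appear
        p_obs = frame_centroids[obj_id]     # centroid from instance segmentation
        dynamic[obj_id] = np.linalg.norm(p_obs - p_pred) > thresh
    return dynamic

# Toy usage: a stationary chair and a person who has moved 0.3 m.
model = {"chair": np.array([1.0, 0.0, 2.0]), "person": np.array([0.0, 0.0, 3.0])}
frame = {"chair": np.array([1.0, 0.0, 2.0]), "person": np.array([0.3, 0.0, 3.0])}
flags = detect_dynamic_objects(model, frame, (np.eye(3), np.zeros(3)))
```

Objects flagged as dynamic would then be excluded from the frame-to-model alignment, so only static geometry contributes to camera motion estimation.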

Highlights

Published: 11 January 2021

  • Intelligent robots are widely used in industry and have extensive application prospects in environments shared with humans

  • This paper presents a new object-level RGB-D dense simultaneous localization and mapping (SLAM) approach for accurate motion estimation of an RGB-D camera in the presence of moving objects, while constructing an object-level semantic map of the scene

  • Object motion detection is performed in 3D space

Introduction

Intelligent robots are widely used in industry and have extensive application prospects in environments shared with humans. Visual simultaneous localization and mapping (SLAM) systems jointly solve the tasks of tracking the position of a camera as it explores unknown locations and building a 3D map of the environment. With the rapid development of computer vision, visual SLAM has attracted increasing attention due to its diverse application scenarios, low cost, and ability to extract semantic information. The wide availability of affordable structured-light and time-of-flight depth sensors, together with advances in computer hardware, has both democratized the real-time acquisition of 3D models from hand-held cameras and provided robots with powerful yet low-cost 3D sensing capabilities, triggering extensive research on real-time dense SLAM.
