Abstract

In this paper, we formulate a novel strategy to adapt monocular-vision-based simultaneous localization and mapping (vSLAM) to dynamic environments. When enough background features can be captured, our system not only tracks the camera trajectory from static background features but also estimates the foreground object motion from object features. When a moving object obstructs too many background features for camera tracking from the background to succeed, our system exploits the features on the object and a prediction of the object motion to estimate the camera pose. We evaluate the capabilities of our system on various synthetic and real-world test scenarios as well as the well-known TUM sequences. The experiments show that we achieve higher pose estimation accuracy and robustness than state-of-the-art monocular vSLAM systems.

Highlights

  • We extend monocular ORB-SLAM2 [4] to dynamic environments under the assumptions that the moving objects in the scene are rigid bodies and that their motion is predictable over a few video frames

  • We present a novel monocular vision-based simultaneous localization and mapping (vSLAM) system that uses deep-learning-based semantic segmentation to reduce the impact of dynamic objects and exploits the dynamic features to improve accuracy and robustness

  • We present a method, built on ORB-SLAM2 [4], to estimate camera pose and object motion simultaneously for monocular cameras

  • We propose a strategy to recover the camera pose from the predicted object motion when a moving object obstructs so many background features that tracking from the background alone is impossible

  • We construct DOE-SLAM, a monocular vSLAM system that builds on ORB-SLAM2


Summary

Introduction

We extend monocular ORB-SLAM2 [4] to dynamic environments under the assumptions that the moving objects in the scene are rigid bodies and that their motion is predictable over a few video frames. We present a novel monocular vSLAM system that uses deep-learning-based semantic segmentation to reduce the impact of dynamic objects and exploits the dynamic features to improve accuracy and robustness. In addition, we present a method, built on ORB-SLAM2 [4], to estimate camera pose and object motion simultaneously for monocular cameras, and we propose a strategy to recover the camera pose from the predicted object motion when a moving object obstructs so many background features that tracking from the background alone is impossible.
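The fallback idea can be sketched in a few lines: predict the rigid object's next pose with a constant-velocity motion model on SE(3), then recover the camera pose from the object features alone. This is a minimal illustration under assumed conventions (4x4 homogeneous object-to-world and object-to-camera transforms), not the paper's exact formulation; in practice the object-to-camera transform would come from PnP on the segmented object features.

```python
import numpy as np

def predict_object_pose(O_prev2, O_prev1):
    """Constant-velocity prediction on SE(3): re-apply the motion
    observed between the last two frames (an assumed motion model).
    O_prev2, O_prev1 are 4x4 object-to-world transforms at t-2, t-1."""
    delta = O_prev1 @ np.linalg.inv(O_prev2)  # inter-frame object motion
    return delta @ O_prev1                    # predicted pose at t

def camera_pose_from_object(O_pred, T_co):
    """Recover the camera-to-world pose from the predicted object pose.
    With p_w = O_pred @ p_o and p_c = T_co @ p_o (T_co estimated from
    object features, e.g. via PnP), it follows that
    T_wc = O_pred @ inv(T_co)."""
    return O_pred @ np.linalg.inv(T_co)
```

For example, an object translating 1 unit per frame along x is predicted 1 unit further at the next frame, and the camera pose then follows from the object-to-camera transform observed in that frame.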

Related Work
Classic SLAM
Dynamic SLAM
Method
Object Modeling
Object Motion Estimation and Optimization
Camera Pose from Object Motion
Experiments
Motion of Previously Static Objects
Fully Dynamic Object
TUM Dataset
Real-World Test Cases
Findings
Conclusions