Abstract

In practical applications, balancing high accuracy against computational efficiency is a central challenge for simultaneous localization and mapping (SLAM). To address it, we propose SD-VIS, a fast and accurate semi-direct visual-inertial SLAM framework that estimates camera motion and the sparse structure of the surrounding scene. During initialization, we align the pre-integrated IMU measurements with the visual images and calibrate the metric scale, initial velocity, gravity vector, and gyroscope bias using multiple view geometry (MVG) theory based on the feature-based method. At the front-end, keyframes are tracked by the feature-based method and used for back-end optimization and loop closure detection, while non-keyframes are tracked quickly by the direct method. This strategy gives the system the real-time performance of the direct method together with the accuracy and loop closure detection ability of the feature-based method. At the back-end, we propose a sliding-window-based tightly-coupled optimization framework that obtains more accurate state estimates by minimizing the visual and IMU measurement errors. To limit computational complexity, we adopt a marginalization strategy that fixes the number of keyframes in the sliding window. Experimental evaluation on the EuRoC dataset demonstrates the feasibility and superior real-time performance of SD-VIS. Compared with state-of-the-art SLAM systems, SD-VIS achieves a better balance between accuracy and speed.
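
The marginalization step mentioned in the abstract can be illustrated with a small sketch. It assumes the common formulation in which the sliding-window problem is linearized into a Gaussian system H x = b and old states are removed with a Schur complement; the abstract does not give the exact formulation used in SD-VIS, and the function name `marginalize` and the toy 3-variable system are purely illustrative.

```python
import numpy as np

def marginalize(H, b, keep, drop):
    """Marginalize the `drop` variables out of the linear system H x = b.

    Returns the reduced (prior) system on the `keep` variables, computed
    with the Schur complement of the dropped block.
    """
    H_kk = H[np.ix_(keep, keep)]
    H_kd = H[np.ix_(keep, drop)]
    H_dd = H[np.ix_(drop, drop)]
    # Solve against H_dd instead of forming its explicit inverse.
    H_dd_inv_H_dk = np.linalg.solve(H_dd, H_kd.T)
    H_dd_inv_b_d = np.linalg.solve(H_dd, b[drop])
    H_prior = H_kk - H_kd @ H_dd_inv_H_dk
    b_prior = b[keep] - H_kd @ H_dd_inv_b_d
    return H_prior, b_prior

# Toy system (assumption): three scalar states, the oldest (index 0) is dropped.
rng = np.random.default_rng(0)
A = rng.standard_normal((6, 3))
H = A.T @ A + np.eye(3)          # symmetric positive definite
b = rng.standard_normal(3)

H_prior, b_prior = marginalize(H, b, keep=[1, 2], drop=[0])
x_full = np.linalg.solve(H, b)
x_reduced = np.linalg.solve(H_prior, b_prior)
```

In this sketch the reduced system yields the same estimate for the kept states as the full system, which is why the window can stay at a fixed size without discarding the information carried by the removed keyframe.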

Highlights

  • Simultaneous localization and mapping (SLAM) plays an important role in self-driving cars, virtual reality, unmanned aerial vehicles (UAVs), augmented reality, and artificial intelligence [1,2]. This technology can provide reliable state estimation for UAVs and self-driving cars in GPS-denied environments by relying only on onboard sensors

  • Various types of sensors can be utilized in SLAM, such as stereo cameras, lidar, inertial measurement units (IMUs), and monocular cameras

  • Each sensor has significant disadvantages when used individually: a stereo camera obtains the metric scale directly from its fixed baseline length, but can estimate depth accurately only within a limited range [3]; lidar has high precision indoors, but suffers from reflections off glass surfaces outdoors [4]; cheap IMUs are extremely susceptible to bias and noise [5]; a monocular camera cannot estimate the absolute metric scale [6]

Summary

Introduction

Simultaneous localization and mapping (SLAM) plays an important role in self-driving cars, virtual reality, unmanned aerial vehicles (UAVs), augmented reality, and artificial intelligence [1,2]. The direct method considers the entire image, or only pixels with a large gradient, and directly estimates the camera motion and scene structure by minimizing the photometric error [11,12,13,14]. In [25,26], different semi-direct approaches were proposed for stereo odometry. Both methods use feature-based tracking to obtain a motion prior and then perform direct semi-dense or sparse alignment to refine the camera pose. We only need to extract new feature points on the keyframes and track them with the KLT sparse optical flow algorithm, which can further reduce the computation while ensuring accuracy. In SD-VIS, keyframes are tracked by the feature-based method and are used for sliding-window non-linear optimization and loop closure detection.
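
To make "minimizing the photometric error" concrete, the following toy sketch compares intensities at a sparse set of pixels and searches for the image shift with the smallest sum of squared differences. It is a deliberate simplification and not the SD-VIS tracker: it assumes a pure integer 2-D translation and a brute-force search, whereas a real direct method optimizes a full camera pose with sub-pixel warping. The names `photometric_error` and `best_shift`, the random test image, and the grid of sample pixels (standing in for high-gradient pixels) are all illustrative assumptions.

```python
import numpy as np

def photometric_error(ref, cur, pixels, shift):
    """Sum of squared intensity differences for `pixels` moved by `shift`.

    `pixels` holds (row, col) positions in the reference image and
    `shift` is an integer (d_row, d_col) displacement into the current image.
    """
    dr, dc = shift
    err = 0.0
    for r, c in pixels:
        diff = float(cur[r + dr, c + dc]) - float(ref[r, c])
        err += diff * diff
    return err

def best_shift(ref, cur, pixels, radius=2):
    """Brute-force search for the shift that minimizes the photometric error."""
    candidates = [(dr, dc)
                  for dr in range(-radius, radius + 1)
                  for dc in range(-radius, radius + 1)]
    return min(candidates, key=lambda s: photometric_error(ref, cur, pixels, s))

# Toy data (assumption): the current image is the reference moved by (1, -2).
rng = np.random.default_rng(1)
ref = rng.random((32, 32))
cur = np.roll(ref, shift=(1, -2), axis=(0, 1))

# Sparse sample pixels, kept away from the border so shifted lookups stay in bounds.
pixels = [(r, c) for r in range(8, 24, 4) for c in range(8, 24, 4)]
shift = best_shift(ref, cur, pixels)
```

Only 16 pixels are evaluated per candidate, which hints at why sparse direct alignment is cheap compared with extracting and matching descriptors on every frame.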

System Framework Overview
Definition of Symbols
IMU Pre-Integration
Visual-Inertial Alignment
Gyroscope Bias Correction
Gravity Vector Refinement
Keyframe Selection
Keyframes Tracking
Non-Keyframes
Sliding Window-based Tightly-coupled Optimization Framework
Formulation
Visual Re-Projection Errors
Marginalization Strategy
Re-Localization
Accuracy and Robustness Evaluation
Real-Time Performance Evaluation
Loop Closure Detection Evaluation
Conclusions