Abstract

Visual-inertial simultaneous localization and mapping (VI-SLAM) is a popular research topic in robotics. Because of its robustness, VI-SLAM is widely applied in localization and mapping, including in mobile robotics, self-driving cars, unmanned aerial vehicles, and autonomous underwater vehicles. This study provides a comprehensive survey of VI-SLAM. Following a short introduction, this study is the first to review VI-SLAM techniques from filtering-based and optimization-based perspectives. It summarizes state-of-the-art studies from the last 10 years according to back-end approach, camera type, and sensor fusion type. Key VI-SLAM technologies, such as feature extraction and tracking, core theory, and loop closure, are also introduced. The performance of representative VI-SLAM methods and well-known VI-SLAM datasets are also surveyed. Finally, this study compares filtering-based and optimization-based methods through experiments. This comparison helps clarify the differences in their operating principles. Optimization-based methods achieve excellent localization accuracy and lower memory utilization, while filtering-based methods have advantages in terms of computing resources. Furthermore, this study proposes future development trends and research directions for VI-SLAM. It provides a detailed survey of VI-SLAM techniques and can serve as a brief guide both for newcomers to the field of SLAM and for experienced researchers looking for possible directions for future work.

Highlights

  • Simultaneous localization and mapping (SLAM) technology was first proposed by Smith [1,2] and applied in robotics with the goal of building a real-time map of the surroundings from sensor data in an unknown environment while the sensor localizes itself

  • The appearance-based approach determines the loop closure relationship to eliminate the cumulative error according to the similarity of two images, and it has been used successfully in visual-inertial simultaneous localization and mapping (VI-SLAM) systems [18,31,60]

  • Different VI-SLAM methods are designed for different applications and it is hard to [...] while it is manually piloted around three different indoor environments

Summary

Introduction

Simultaneous localization and mapping (SLAM) technology was first proposed by Smith [1,2] and applied in robotics with the goal of building a real-time map of the surroundings from sensor data in an unknown environment while the sensor localizes itself. New methods have since appeared using different sensors such as sonar [3], lidar [4], and cameras [5]. Maplab is a filtering-based VI-SLAM system that provides the research community with a collection of multi-session mapping tools including map merging, loop closure, and visual-inertial optimization. VINS-Mono is a real-time optimization-based VI-SLAM system that uses a sliding window to provide high-precision odometry. It features efficient IMU pre-integration with bias correction, automatic estimator initialization, online extrinsic calibration, failure detection, and loop detection. This work summarizes research from the previous 10 years, surveys well-known VI-SLAM datasets, and compares filtering-based and optimization-based methods through experiments. Potential development trends and forthcoming research directions are also introduced.
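
To make the idea of IMU pre-integration with bias correction more concrete, the following is a minimal sketch and not the implementation used by VINS-Mono or any other system named here. It assumes motion along a single axis, accelerometer samples with gravity already removed, no rotation, and a constant sample interval. The names `Preintegrated`, `preintegrate`, `correct_bias`, and `predict` are hypothetical and chosen only for illustration. The point it tries to show is that the accumulated deltas do not depend on the starting state, and that a changed bias estimate can be applied through stored Jacobians without re-integrating the raw samples.

```python
from dataclasses import dataclass


@dataclass
class Preintegrated:
    """Relative motion accumulated between two keyframes (1D, illustrative)."""
    dt_total: float = 0.0   # elapsed time between the keyframes
    delta_v: float = 0.0    # velocity change, independent of the start state
    delta_p: float = 0.0    # position change excluding the v_i * T term
    dv_dbias: float = 0.0   # Jacobian of delta_v w.r.t. accelerometer bias
    dp_dbias: float = 0.0   # Jacobian of delta_p w.r.t. accelerometer bias


def preintegrate(accels, dt, bias):
    """Accumulate deltas from raw accelerometer samples and a bias estimate."""
    pre = Preintegrated()
    for a in accels:
        acc = a - bias
        # Update position terms first so they use the velocity delta
        # from before this sample.
        pre.delta_p += pre.delta_v * dt + 0.5 * acc * dt * dt
        pre.dp_dbias += pre.dv_dbias * dt - 0.5 * dt * dt
        pre.delta_v += acc * dt
        pre.dv_dbias -= dt
        pre.dt_total += dt
    return pre


def correct_bias(pre, bias_change):
    """First-order update of the deltas when the bias estimate changes."""
    return Preintegrated(
        dt_total=pre.dt_total,
        delta_v=pre.delta_v + pre.dv_dbias * bias_change,
        delta_p=pre.delta_p + pre.dp_dbias * bias_change,
        dv_dbias=pre.dv_dbias,
        dp_dbias=pre.dp_dbias,
    )


def predict(p_i, v_i, pre):
    """Propagate the state at keyframe i to keyframe j using the deltas."""
    p_j = p_i + v_i * pre.dt_total + pre.delta_p
    v_j = v_i + pre.delta_v
    return p_j, v_j


# Usage: integrate once, then reuse the result for any start state or bias update.
samples = [1.0] * 100                      # assumed constant acceleration
pre = preintegrate(samples, dt=0.01, bias=0.0)
corrected = correct_bias(pre, bias_change=0.1)
p_j, v_j = predict(p_i=2.0, v_i=3.0, pre=pre)
```

In this simplified linear setting the first-order bias correction happens to be exact. In a full 3D formulation with rotation it would only be an approximation that holds for small bias changes.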

Filtering-Based Methods
Methods
Feature Tracking
Dynamic and Observational Models
Filtering-Based VIO and VI-SLAM
Optimization-Based Methods
Loop Closure
Optimization-Based VI-SLAM Algorithms
Details
Experiments
SLAM with Deep Learning
Hardware Integration and Multi-Sensor Fusion
Active SLAM on Robots
Applications on Complex Dynamic Environments
Findings
Conclusions