Abstract

Vision-based localization systems, namely visual odometry (VO) and visual inertial odometry (VIO), have attracted great attention recently. They are regarded as critical modules for building fully autonomous systems. The simplicity of visual and inertial state estimators, along with their applicability to resource-constrained platforms, has motivated the robotics community to research and develop novel approaches that maximize their robustness and reliability. In this paper, we survey state-of-the-art VO and VIO approaches; studies related to localization in visually degraded environments are also reviewed. The reviewed VO techniques and related studies are analyzed in terms of key design aspects, covering appearance-based, feature-based, and learning-based approaches. Research studies related to VIO, on the other hand, are categorized by the degree and type of sensor fusion into loosely coupled, semi-tightly coupled, and tightly coupled approaches, and by estimation paradigm into filtering-based and optimization-based methods. This paper provides an overview of the main components of visual localization, highlights the pros and cons of each key design choice, and compares the latest research works in this field. Finally, a detailed discussion of the challenges associated with the reviewed approaches and considerations for future research is provided.

Highlights

  • Unmanned aerial/ground vehicles (UAVs/UGVs) offer many advantages, most notably a mobility that combines flexibility and strength

  • In this article, we survey state-of-the-art studies on vision-based localization solutions, namely visual odometry (VO) and visual inertial odometry (VIO), to aid autonomous navigation in Global Navigation Satellite System (GNSS)-denied environments

  • We have conducted a comprehensive review of self-localization techniques for autonomous navigation in visually degraded environments


Summary

INTRODUCTION

Unmanned aerial/ground vehicles (UAVs/UGVs) offer many advantages, most notably a mobility that combines flexibility and strength. In regional (appearance-based) methods, motion is estimated by concatenating camera poses obtained from an alignment process between two consecutive images; this technique has been extended by measuring the invariant similarities of local regions and applying global constraints. To achieve high monocular VO accuracy, Jiao et al. [101] utilized a learning framework that combines a CNN and a Bi-LSTM: the CNN leverages the feature properties of image pairs, while the Bi-LSTM captures the relationship between the features of successive images. Another approach to the VO problem is the RCNN proposed by Liu et al. [102], a learning-based model trained in an end-to-end manner using RGB-D sensors; their model outperformed the pose estimation obtained from a standard frame-based VIO by 85%.
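To make the learning-based pipeline above more concrete, below is a minimal, hypothetical PyTorch sketch in the spirit of the CNN + Bi-LSTM framework attributed to Jiao et al. [101]: a small CNN encodes each channel-stacked image pair into a feature vector, a bidirectional LSTM models the relationship between features of successive pairs, and a linear head regresses a 6-DoF relative pose per step. The class name CNNBiLSTMVO, the layer sizes, and the input resolution are illustrative assumptions, not the authors' implementation.

    import torch
    import torch.nn as nn

    class CNNBiLSTMVO(nn.Module):
        """Illustrative CNN + Bi-LSTM relative-pose regressor (assumed architecture)."""
        def __init__(self, feat_dim=256, hidden_dim=128):
            super().__init__()
            # CNN encoder: consumes a channel-stacked image pair (2 x RGB = 6 channels)
            # and produces one feature vector per pair.
            self.encoder = nn.Sequential(
                nn.Conv2d(6, 32, kernel_size=7, stride=2, padding=3), nn.ReLU(),
                nn.Conv2d(32, 64, kernel_size=5, stride=2, padding=2), nn.ReLU(),
                nn.Conv2d(64, 128, kernel_size=3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1),
            )
            self.fc_feat = nn.Linear(128, feat_dim)
            # Bi-LSTM captures dependencies between features of successive image pairs.
            self.bilstm = nn.LSTM(feat_dim, hidden_dim, batch_first=True, bidirectional=True)
            # Regress a 6-DoF relative pose (3 translation + 3 rotation) per time step.
            self.fc_pose = nn.Linear(2 * hidden_dim, 6)

        def forward(self, pairs):
            # pairs: (batch, seq_len, 6, H, W), consecutive frames stacked channel-wise.
            b, s, c, h, w = pairs.shape
            x = self.encoder(pairs.view(b * s, c, h, w)).flatten(1)   # (b*s, 128)
            x = self.fc_feat(x).view(b, s, -1)                        # (b, s, feat_dim)
            x, _ = self.bilstm(x)                                     # (b, s, 2*hidden_dim)
            return self.fc_pose(x)                                    # (b, s, 6) relative poses

    # Example: relative poses for a short sequence of five 240x320 frame pairs.
    model = CNNBiLSTMVO()
    poses = model(torch.randn(1, 5, 6, 240, 320))
    print(poses.shape)  # torch.Size([1, 5, 6])

In practice, the per-step relative poses regressed by such a model would be composed over time to recover the full trajectory, mirroring the pose-concatenation step described above for regional (appearance-based) methods.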

DISCUSSION AND FUTURE
CONCLUSION