In this paper, we propose Mix-VIO, a monocular and binocular visual-inertial odometry system, to address the failure of conventional visual front-end tracking under dynamic lighting and image blur. Mix-VIO adopts a hybrid tracking approach that combines traditional handcrafted tracking techniques with Deep Neural Network (DNN)-based feature extraction and matching pipelines. The system employs deep learning methods for rapid feature point detection and integrates traditional optical flow with deep learning-based sparse feature matching to enhance front-end tracking performance under rapid camera motion and environmental illumination changes. In the back-end, we use sliding-window bundle adjustment (BA) for local map optimization and pose estimation. Extensive experimental validation of the hybrid feature extraction and matching methods demonstrates the system's ability to maintain robust tracking under illumination changes and image blur.
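The hybrid front-end described above can be illustrated with a minimal, self-contained sketch: a cheap optical-flow tracker is tried first, and the slower DNN matcher is invoked only when too many tracks are lost (e.g., under blur). Here `klt_track` and `learned_match` are hypothetical stand-ins that simulate the behavior of a KLT tracker and a learned sparse matcher; they are not the paper's actual implementation.

```python
import numpy as np

def klt_track(prev_pts, curr_pts_gt, blur_level):
    """Hypothetical stand-in for pyramidal KLT optical flow.

    Under heavy blur a fraction of tracks is lost (marked NaN)."""
    rng = np.random.default_rng(0)
    lost = rng.random(len(prev_pts)) < blur_level  # blur kills tracks
    tracked = curr_pts_gt.copy()
    tracked[lost] = np.nan
    return tracked

def learned_match(prev_pts, curr_pts_gt):
    """Hypothetical stand-in for a DNN sparse matcher: slower,
    but assumed robust to blur, so it recovers all correspondences here."""
    return curr_pts_gt.copy()

def hybrid_track(prev_pts, curr_pts_gt, blur_level, min_inlier_ratio=0.6):
    """Hybrid front-end: try cheap optical flow first,
    fall back to the DNN matcher when the inlier ratio drops."""
    tracked = klt_track(prev_pts, curr_pts_gt, blur_level)
    ok = ~np.isnan(tracked[:, 0])
    if ok.mean() >= min_inlier_ratio:
        return tracked, "optical_flow"
    return learned_match(prev_pts, curr_pts_gt), "dnn_matcher"

pts = np.random.default_rng(1).random((50, 2)) * 100
gt = pts + 2.0  # simulated camera motion between frames

_, mode_sharp = hybrid_track(pts, gt, blur_level=0.1)
_, mode_blur = hybrid_track(pts, gt, blur_level=0.9)
print(mode_sharp, mode_blur)  # → optical_flow dnn_matcher
```

The `min_inlier_ratio` threshold is an assumed tuning parameter; the key design choice sketched here is that the expensive learned matcher runs only when the handcrafted tracker degrades, keeping the front-end fast in the common case.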