Abstract

Visual odometry plays an important role in urban autonomous driving. Feature-based visual odometry methods sample their candidates randomly from all available feature points, while alignment-based visual odometry methods take all pixels into account. Both kinds of methods assume that the quantitative majority of candidate visual cues represents the true motion. In real urban traffic scenes, however, this assumption can be broken by the many dynamic traffic participants: a big truck or bus may occupy most of the image of a front-view monocular camera and lead to wrong visual odometry estimates. Finding visual cues that represent the real motion is the most important and hardest step for visual odometry in dynamic environments, and the semantic attributes of pixels are a more reasonable factor for candidate selection in that case. This article analyzed the availability of all visual cues with the help of pixel-level semantic information and proposed a new visual odometry method that combines feature-based and alignment-based visual odometry in one optimization pipeline. The proposed method was compared with three open-source visual odometry algorithms on the KITTI benchmark data sets and on our own data set. Experimental results confirmed that the new approach improves both accuracy and robustness in complex dynamic scenes.
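To make the candidate-selection step concrete, here is a minimal Python sketch of filtering feature candidates by pixel-level semantic labels before motion estimation. The class list, the function name select_static_keypoints, and the data layout are illustrative assumptions for this sketch, not the paper's actual implementation.

import numpy as np

# Semantic classes treated as potentially dynamic in urban traffic scenes
# (an assumed list; the paper analyzes cue availability per semantic class).
DYNAMIC_CLASSES = {"car", "truck", "bus", "motorcycle", "bicycle", "person"}

def select_static_keypoints(keypoints, semantic_map, id_to_class):
    """Keep only keypoints whose pixel-level semantic label is static.

    keypoints    : (N, 2) array of integer (u, v) pixel coordinates
    semantic_map : (H, W) array of per-pixel class ids
    id_to_class  : dict mapping class id -> class name
    """
    static = []
    for u, v in keypoints:
        # Look up the semantic class at this pixel and drop dynamic ones.
        if id_to_class[int(semantic_map[v, u])] not in DYNAMIC_CLASSES:
            static.append((u, v))
    return np.asarray(static)

Feature-based matching and direct alignment can then both draw their candidates from the surviving static pixels, which is consistent with combining the two methods in one optimization pipeline.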

Highlights

  • Visual odometry (VO) is the most important part of a visual simultaneous localization and mapping (V-SLAM) algorithm and has already been widely used in optical mice, small mobile robots, and unmanned aerial vehicles (UAVs)

  • The term VO was first used by Nistér in 2004,[1] but research in this area spans more than 30 years[2,3]

  • The intelligent vehicle platform was retrofitted from a Changan Raeton car and equipped with two AVT® 1394 Pike F-200c cameras capturing front-view stereo images, an OxTS inertial measurement unit (IMU), a NovAtel RTK-GPS, and a Velodyne VLP-16 LiDAR

  • ORB_SLAM2 and direct sparse odometry (DSO) failed directly, and VISO could not provide any reasonable trajectories


Introduction

Visual odometry (VO) is the most important part of a visual simultaneous localization and mapping (V-SLAM) algorithm and has already been widely used in optical mice, small mobile robots, and unmanned aerial vehicles (UAVs). Deep learning has been applied successfully to object detection, image semantic segmentation,[23] and spatial semantics learning.[24] Such semantic information can provide more causal factors for visual motion estimation and helps to improve robustness in complex environments.[25] Mohanty et al. proposed a deep VO method that estimates the odometry vector between an arbitrary image pair with a trained convolutional neural network (CNN).[26] These efforts provide a better way to look inside how VO uses features or pixels, and they move the evaluation of VO toward criteria that humans can understand. In section “Conclusions,” we summarize the method and outline future work.
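As a rough illustration of the CNN-based pairwise odometry idea mentioned above, the sketch below regresses a six-degree-of-freedom motion vector from two stacked RGB frames. The architecture and the name PairwiseOdometryCNN are assumptions made for this sketch and do not reproduce Mohanty et al.'s actual network.

import torch
import torch.nn as nn

class PairwiseOdometryCNN(nn.Module):
    def __init__(self):
        super().__init__()
        # Two consecutive RGB frames stacked along the channel axis (6 input channels).
        self.features = nn.Sequential(
            nn.Conv2d(6, 32, kernel_size=7, stride=2, padding=3), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(64, 128, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        # Regress translation (x, y, z) and rotation (roll, pitch, yaw).
        self.head = nn.Linear(128, 6)

    def forward(self, frame_t, frame_t1):
        x = torch.cat([frame_t, frame_t1], dim=1)  # (B, 6, H, W)
        return self.head(self.features(x).flatten(1))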


