Abstract

Due to the projective nature of a single camera, monocular visual simultaneous localization and mapping (vSLAM) algorithms suffer from scale drift. Recent studies have addressed scale drift by estimating the height of the camera above a fitted ground plane. However, the low accuracy of 3D ground point (3DGP) reconstruction and the insufficient number of 3DGPs in textureless ground regions make the fitted plane unreliable for scale estimation. In this paper, we introduce a novel accuracy measure for the fitted plane. From the previous keyframe, we first obtain 3D points as the intersections of the fitted plane with rays passing through the camera center and the 2D feature points on the ground region. We then transform the 3D points and the 2D feature points to the current keyframe using their respective planar homographies. Finally, we project the transformed 3D points onto the image plane of the current keyframe and measure the average Euclidean distance between the projected points and the transformed feature points. The proposed method achieves an average translation error of 1.03% on the KITTI dataset, outperforming state-of-the-art monocular vSLAM methods.

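The sketch below illustrates one way the described accuracy measure could be computed: back-project ground feature points from the previous keyframe, intersect the rays with the fitted plane, move the resulting 3D points into the current keyframe, and compare their projections with the homography-transferred 2D features. It is a minimal illustration, not the authors' implementation; the function name `plane_reprojection_error`, the plane parameterization `n^T X + d = 0`, and the use of the relative pose (R, t) for the 3D points alongside the plane-induced homography H for the 2D points are assumptions made for clarity.

```python
import numpy as np

def plane_reprojection_error(pts2d_prev, n, d, K, R, t, H):
    """Accuracy measure for a fitted ground plane (illustrative sketch).

    pts2d_prev : (N, 2) ground feature points in the previous keyframe
    n, d       : plane normal and offset in the previous camera frame (n^T X + d = 0)
    K          : (3, 3) camera intrinsic matrix
    R, t       : relative pose from the previous to the current keyframe (assumed)
    H          : (3, 3) plane-induced homography from the previous to the current keyframe
    """
    K_inv = np.linalg.inv(K)
    errors = []
    for u, v in pts2d_prev:
        # Ray through the camera center and the 2D feature point (previous keyframe).
        ray = K_inv @ np.array([u, v, 1.0])
        # Intersect the ray with the fitted plane: n^T (s * ray) + d = 0  =>  s = -d / (n^T ray).
        denom = n @ ray
        if abs(denom) < 1e-9:
            continue  # Ray is (nearly) parallel to the plane; skip this point.
        X_prev = (-d / denom) * ray
        # Transform the 3D point to the current keyframe with the relative pose.
        X_curr = R @ X_prev + t
        # Project the transformed 3D point onto the current image plane.
        p = K @ X_curr
        proj = p[:2] / p[2]
        # Transfer the 2D feature point with the planar homography.
        q = H @ np.array([u, v, 1.0])
        feat = q[:2] / q[2]
        # Euclidean distance between the projected point and the transferred feature.
        errors.append(np.linalg.norm(proj - feat))
    return float(np.mean(errors)) if errors else np.inf
```

Under this reading, a small average distance indicates that the fitted plane is consistent with the observed ground features across the two keyframes, and hence trustworthy for scale estimation.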