Abstract

Due to the projective nature of a single camera, monocular visual simultaneous localization and mapping (vSLAM) algorithms suffer from scale drift. Recent studies have addressed scale drift by estimating the height of the camera above a fitted ground plane. However, the low accuracy of 3D ground point (3DGP) reconstruction and the insufficient number of 3DGPs in textureless ground regions make the fitted plane unreliable for scale estimation. In this paper, we introduce a novel accuracy measure for the fitted plane. From the previous keyframe, we first obtain 3D points as the intersections of the fitted plane with rays passing through the camera center and the 2D feature points on the ground region. We then transform the 3D points and the 2D feature points to the current keyframe using their respective planar homographies. Finally, we project the transformed 3D points onto the image plane of the current keyframe and measure the average Euclidean distance between the projected points and the transformed feature points. The proposed method achieves an average translation error of 1.03% on the KITTI dataset, outperforming state-of-the-art monocular vSLAM methods.

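The sketch below illustrates one way the described accuracy measure could be computed: back-project ground feature points from the previous keyframe, intersect the rays with the fitted plane, move the resulting 3D points into the current keyframe, and compare their projections with the homography-transferred 2D features. It is a minimal illustration, not the authors' implementation; the function name `plane_reprojection_error`, the plane parameterization `n^T X + d = 0`, and the use of the relative pose (R, t) for the 3D points alongside the plane-induced homography H for the 2D points are assumptions made for clarity.

```python
import numpy as np

def plane_reprojection_error(pts2d_prev, n, d, K, R, t, H):
    """Accuracy measure for a fitted ground plane (illustrative sketch).

    pts2d_prev : (N, 2) ground feature points in the previous keyframe
    n, d       : plane normal and offset in the previous camera frame (n^T X + d = 0)
    K          : (3, 3) camera intrinsic matrix
    R, t       : relative pose from the previous to the current keyframe (assumed)
    H          : (3, 3) plane-induced homography from the previous to the current keyframe
    """
    K_inv = np.linalg.inv(K)
    errors = []
    for u, v in pts2d_prev:
        # Ray through the camera center and the 2D feature point (previous keyframe).
        ray = K_inv @ np.array([u, v, 1.0])
        # Intersect the ray with the fitted plane: n^T (s * ray) + d = 0  =>  s = -d / (n^T ray).
        denom = n @ ray
        if abs(denom) < 1e-9:
            continue  # Ray is (nearly) parallel to the plane; skip this point.
        X_prev = (-d / denom) * ray
        # Transform the 3D point to the current keyframe with the relative pose.
        X_curr = R @ X_prev + t
        # Project the transformed 3D point onto the current image plane.
        p = K @ X_curr
        proj = p[:2] / p[2]
        # Transfer the 2D feature point with the planar homography.
        q = H @ np.array([u, v, 1.0])
        feat = q[:2] / q[2]
        # Euclidean distance between the projected point and the transferred feature.
        errors.append(np.linalg.norm(proj - feat))
    return float(np.mean(errors)) if errors else np.inf
```

Under this reading, a small average distance indicates that the fitted plane is consistent with the observed ground features across the two keyframes, and hence trustworthy for scale estimation.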