Abstract

The homogeneous transformation between a LiDAR and monocular camera is required for sensor fusion tasks, such as SLAM. While determining such a transformation is not considered glamorous in any sense of the word, it is nonetheless crucial for many modern autonomous systems. Indeed, an error of a few degrees in rotation or a few percent in translation can lead to 20 cm reprojection errors at a distance of 5 m when overlaying a LiDAR image on a camera image. The biggest impediments to determining the transformation accurately are the relative sparsity of LiDAR point clouds and systematic errors in their distance measurements. This paper proposes (1) the use of targets of known dimension and geometry to ameliorate target pose estimation in the face of the quantization and systematic errors inherent in a LiDAR image of a target, and (2) a fitting method for the LiDAR to monocular camera transformation that fundamentally assumes the camera image data is the most accurate information in one's possession.
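As a rough back-of-the-envelope check on the stated error magnitude (the 2.3° figure below is illustrative and not taken from the paper), a rotation error of a couple of degrees by itself displaces a reprojected point at a range of 5 m by about

$$
5\,\text{m} \times \tan(2.3^{\circ}) \approx 0.20\,\text{m}.
$$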

Highlights

  • The desire to produce 3D-semantic maps [1] with our Cassie-series bipedal robot [2] has motivated us to fuse 3D-LiDAR and RGB-D monocular camera data for autonomous navigation [3]

  • The resulting map fuses LiDAR, camera, and IMU data; with the addition of a simple planner, it enabled autonomous navigation [41]

  • We proposed a new method to determine the extrinsic calibration of a LiDAR-camera pair


Summary

INTRODUCTION AND RELATED WORK

The desire to produce 3D-semantic maps [1] with our Cassie-series bipedal robot [2] has motivated us to fuse 3D-LiDAR and RGB-D monocular camera data for autonomous navigation [3]. An error of a few degrees in rotation or a few percent in translation in the estimated rigid-body transformation between LiDAR and camera can lead to 20 cm reprojection errors at a distance of 5 m when overlaying a LiDAR point cloud on a camera image. Such errors will lead to navigation errors. The extrinsic calibration is posed as the fitting problem

$$
H_{LC}^{\ast} \;=\; \underset{H_{LC}}{\arg\min} \; \sum_{i} \operatorname{dist}\big( P(H_{LC}\, X_i),\; Y_i \big),
$$

where $X_i$ are the (homogeneous) coordinates of the LiDAR features, $Y_i$ are the coordinates of the camera features, $P$ is the often-called "projection map", $H_{LC}$ is the (homogeneous representation of) the LiDAR-frame to camera-frame transformation with rotation matrix $R_{LC}$ and translation $T_{LC}$, and $\operatorname{dist}$ is a distance or error measure.
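The following is a minimal sketch of that fitting step, assuming a pinhole camera with intrinsic matrix `K`, 3D target features `lidar_pts` expressed in the LiDAR frame, and their 2D pixel correspondences `cam_px`. The Euclidean reprojection residual and the rotation-vector parameterization used here are illustrative choices, not necessarily the paper's exact `dist` or solver.

```python
# Minimal sketch (not the paper's implementation) of the extrinsic fit above:
# estimate the LiDAR-to-camera pose that minimizes the pixel reprojection error
# of matched features. Names (lidar_pts, cam_px, K) are illustrative.
import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation


def project(points_cam, K):
    """Pinhole projection of (N, 3) camera-frame points to (N, 2) pixels."""
    uvw = points_cam @ K.T          # apply intrinsics
    return uvw[:, :2] / uvw[:, 2:]  # perspective divide


def residuals(params, lidar_pts, cam_px, K):
    """Stacked reprojection residuals for a 6-DoF pose (rotation vector + translation)."""
    R = Rotation.from_rotvec(params[:3]).as_matrix()
    t = params[3:]
    pts_cam = lidar_pts @ R.T + t   # map LiDAR-frame points into the camera frame
    return (project(pts_cam, K) - cam_px).ravel()


def fit_extrinsics(lidar_pts, cam_px, K, x0=None):
    """Least-squares estimate of (R_LC, T_LC) from 3D-2D correspondences."""
    x0 = np.zeros(6) if x0 is None else x0
    sol = least_squares(residuals, x0, args=(lidar_pts, cam_px, K))
    return Rotation.from_rotvec(sol.x[:3]).as_matrix(), sol.x[3:]
```

In the paper's pipeline, the $X_i$ would be the target vertices recovered from the LiDAR point cloud and the $Y_i$ the corresponding corners detected in the camera image; a robust or weighted error measure can be substituted for the plain Euclidean residual above.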

ROUGH OVERVIEW OF THE MOST COMMON TARGET-BASED APPROACHES
REMARKS ON LiDAR POINT CLOUDS
NEW METHOD FOR DETERMINING TARGET VERTICES
IMAGE PLANE CORNERS AND CORRESPONDENCES
EXTRINSIC TRANSFORMATION OPTIMIZATION
EXPERIMENTAL RESULTS
CAMERA CORNERS AND ASSOCIATIONS
EXTRINSIC CALIBRATION
QUANTITATIVE RESULTS AND ROUND-ROBIN ANALYSIS
QUALITATIVE RESULTS AND DISCUSSION
CONCLUSIONS