Online Extrinsic Calibration of Camera and LiDAR Based on Cascade Optimization.
Accurate and stable extrinsic calibration is the foundation of high-quality fusion sensing and positioning with a camera and Light Detection and Ranging (LiDAR). However, traditional targetless calibration methods suffer from limitations such as poor scene adaptability and unstable convergence, which significantly restrict calibration accuracy and robustness in complex environments. To address these problems, we propose an online cascade-optimization-based extrinsic calibration method that combines motion trajectory alignment and edge feature alignment. In the initial calibration stage, a hand-eye calibration algorithm is designed by minimizing the residual discrepancies between camera odometry and LiDAR odometry sequences. It establishes a robust initialization for subsequent optimization. Then, in order to extract robust edge line features from sparse point clouds, we employ depth difference and planar edges of point clouds in the optimization process. Subsequently, principal component analysis (PCA) is applied to compute the principal direction of the extracted line features, enabling a decoupled optimization scheme that accounts for directional observability. This approach effectively mitigates the adverse effects of uneven environmental feature distributions. Experimental validation on typical urban datasets demonstrates the method's generalizability and competitive accuracy: rotational parameter errors are constrained within 0.25°, and translational errors are maintained below 0.05 m. This affirms the method's suitability for high-accuracy engineering applications.
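As an illustration of the PCA step mentioned in the abstract, the following is a minimal sketch of how a principal direction could be computed for the 3D points of one line feature. It is not the authors' implementation: the function name `principal_direction`, the power-iteration solver, and the fixed start vector are assumptions chosen to keep the example dependency-free.

```python
import math

def principal_direction(points, iters=100):
    """Unit dominant eigenvector of the 3x3 covariance of `points`.

    Hypothetical helper: uses power iteration from a fixed start vector,
    so it assumes the true direction is not orthogonal to (1, 1, 1).
    """
    n = len(points)
    mean = [sum(p[i] for p in points) / n for i in range(3)]
    centered = [[p[i] - mean[i] for i in range(3)] for p in points]
    cov = [[sum(c[i] * c[j] for c in centered) / n for j in range(3)]
           for i in range(3)]
    v = [1.0, 1.0, 1.0]
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(3)) for i in range(3)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            raise ValueError("degenerate point set")
        v = [x / norm for x in w]
    return v

# Points sampled along the direction (1, 2, 2) through the point (1, 2, 3).
line_points = [(1 + t, 2 + 2 * t, 3 + 2 * t) for t in range(-3, 4)]
direction = principal_direction(line_points)
```

The returned direction is defined only up to sign, so a consumer would typically compare directions through the absolute value of their dot product.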
- Research Article
- 10.1049/ell2.70423
- Jan 1, 2025
- Electronics Letters
Accurate extrinsic calibration between the light detection and ranging (LiDAR) and camera is a critical step for sensor fusion tasks. Existing calibration methods often rely on artificial calibration targets or distinct visual textures, which may not be available in many real‐world environments. In addition, conventional LiDAR systems often capture sparse point clouds, which limits feature extraction and matching in calibration tasks. In this work, we propose a novel extrinsic calibration framework that leverages intensity‐aware deep line registration. Our approach first generates dense point clouds by incrementally registering consecutive LiDAR frames and voxel filtering. This dense point cloud serves as the basis for generating high‐resolution intensity maps. Next, we apply deep learning‐based line detection algorithms to extract robust line features from both the intensity map and the corresponding camera image. By minimising a distance‐based objective function formulated with the 3D line points and 2D image lines, we estimate the extrinsic parameters through an optimisation process. Experimental results show that our method achieves sub‐pixel reprojection accuracy and robustness in various environments. Our calibration method is cost‐effective, easy to deploy and suitable for real‐time robotic applications without the need for artificial targets.
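The distance-based objective described above can be sketched as a sum of squared point-to-line distances in the image. The abstract does not give the exact formulation, so the pinhole model, the `(a, b, c)` line parameterisation, and the names `project` and `line_cost` below are assumptions made for illustration only.

```python
import math

def project(point, R, t, intrinsics):
    """Project a 3D LiDAR point into the image with a pinhole model."""
    fx, fy, cx, cy = intrinsics
    x, y, z = (sum(R[i][j] * point[j] for j in range(3)) + t[i]
               for i in range(3))
    if z <= 0:
        raise ValueError("point is behind the camera")
    return fx * x / z + cx, fy * y / z + cy

def line_cost(points, line, R, t, intrinsics):
    """Sum of squared pixel distances from projected points to a 2D line.

    `line` is (a, b, c) for the image line a*u + b*v + c = 0.
    """
    a, b, c = line
    norm = math.hypot(a, b)
    total = 0.0
    for p in points:
        u, v = project(p, R, t, intrinsics)
        d = (a * u + b * v + c) / norm
        total += d * d
    return total

identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
pts = [(0.0, 1.0, 1.0), (1.0, 1.0, 1.0)]
image_line = (0.0, 1.0, -100.0)  # the horizontal line v = 100
cost_aligned = line_cost(pts, image_line, identity, (0.0, 0.0, 0.0),
                         (100.0, 100.0, 0.0, 0.0))
cost_shifted = line_cost(pts, image_line, identity, (0.0, 0.1, 0.0),
                         (100.0, 100.0, 0.0, 0.0))
```

An optimiser would search over the rotation and translation to drive this cost towards zero; the cost is zero for the aligned extrinsics and grows when the translation is perturbed.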
- Conference Article
11
- 10.1109/cac51589.2020.9327398
- Nov 6, 2020
With the development and maturity of Light Detection and Ranging (LiDAR) technology, various new LiDARs have emerged in recent years, such as the Mid-40 series launched by Livox. Compared with the rotating Velodyne LiDAR, its small scanning range and irregular scanning trajectory pose challenges for SLAM applications. We evaluate the characteristics of the Mid-40 and propose a calibration method based on checkerboard patterns. Our calibration method extracts the checkerboard corners' coordinates from both the point cloud and the image, which are then used to establish 3D-2D matching constraints. By solving the constraints, we can obtain extrinsic parameters between the LiDAR and camera without the need to label the corner features manually. Experimental results suggest that our method is able to solve the extrinsic calibration problem between the Livox Mid-40 LiDAR and a monocular camera with high accuracy.
- Research Article
- 10.1155/2024/2478715
- Jan 1, 2024
- International Journal of Intelligent Systems
With the increasing utilization of cameras and three‐dimensional Light Detection and Ranging (LiDAR) systems in perception tasks, the fusion of these two sensor modalities has emerged as a prominent research focus in the fields of robotics and unmanned systems. While various extrinsic calibration methods have been developed, they often suffer from limited accuracy when using low‐resolution LiDAR sensors and require the placement of calibration targets at multiple locations. This paper introduces a novel calibration target known as the Three‐Dimensional Towered Checkerboard (3TC), along with a precise and straightforward extrinsic calibration approach for camera‐LiDAR systems. The 3TC consists of stacked cubes adorned with planar or 2D checkerboards, which provide the known positions of checkerboard corner points in three‐dimensional space. Leveraging the Iterative Closest Point (ICP) algorithm, the proposed method calculates the spatial relationship between LiDAR point cloud data and the 3TC model to infer the positions of checkerboard corner points in the LiDAR coordinate system. Subsequently, the Perspective‐n‐Point (PnP) algorithm is employed to establish the correlation between corner positions in the LiDAR coordinate system and the camera image, given the intrinsic parameters of the camera. By ensuring an adequate number of cubes and 2D checkerboards on a specific 3TC, along with accurately estimated corner point positions in LiDAR, a single frame of data from both the camera and LiDAR facilitates their extrinsic calibration. Experimental validations conducted across diverse camera and LiDAR systems, achieving minimal error close to the theoretical limit of the devices, attest to the robustness and precision of the 3TC and the proposed calibration methodology.
- Research Article
60
- 10.1109/tim.2013.2258241
- Aug 1, 2013
- IEEE Transactions on Instrumentation and Measurement
Current perception systems of intelligent vehicles not only make use of visual sensors, but also take advantage of depth sensors. Extrinsic calibration of these heterogeneous sensors is required for fusing information obtained separately by vision sensors and light detection and ranging (LIDAR) sensors. In this paper, an optimal extrinsic calibration algorithm between a binocular stereo vision system and a 2-D LIDAR is proposed. Most extrinsic calibration methods between cameras and a LIDAR proceed by calibrating each camera separately with the LIDAR. We show that by placing a common planar chessboard with different poses in front of the multisensor system, the extrinsic calibration problem is solved by a 3-D reconstruction of the chessboard and geometric constraints between the views from the stereovision system and the LIDAR. Furthermore, our method takes sensor noise into account so that it provides optimal results under Mahalanobis distance constraints. To evaluate the performance of the algorithm, experiments based on both computer simulation and real datasets are presented and analyzed. The proposed approach is also compared with a popular camera/LIDAR calibration method to show the benefits of our method.
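To make the Mahalanobis distance criterion concrete, here is a small sketch that weights a 2D residual by the inverse of its noise covariance. The function name `mahalanobis_sq` and the restriction to 2x2 covariances are assumptions for illustration; the paper's actual noise models are not given in the abstract.

```python
def mahalanobis_sq(residual, cov):
    """Squared Mahalanobis distance r^T * inv(cov) * r for a 2D residual."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0:
        raise ValueError("covariance must be positive definite")
    rx, ry = residual
    # inv(cov) = 1/det * [[d, -b], [-c, a]]
    return (rx * (d * rx - b * ry) + ry * (-c * rx + a * ry)) / det

# A residual of (2, 1) with standard deviations of 2 and 1 on the two axes.
weighted = mahalanobis_sq((2.0, 1.0), [[4.0, 0.0], [0.0, 1.0]])
```

With an identity covariance this reduces to the squared Euclidean distance, which is why the weighting matters only when the sensors have unequal or correlated noise.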
- Conference Article
12
- 10.1109/mfi.2012.6343010
- Sep 1, 2012
Visual sensors and depth sensors, such as cameras and LIDAR (Light Detection and Ranging), are increasingly used together in current perception systems of intelligent vehicles. Fusing information obtained separately from these heterogeneous sensors always requires extrinsic calibration of vision sensors and LIDARs. In this paper, we propose an optimal extrinsic calibration algorithm between a binocular stereo vision system and a 2D LIDAR. The extrinsic calibration problem is solved by 3D reconstruction of a chessboard and geometric constraints between the views from the stereovision system and the LIDAR. The proposed approach takes sensor noise models into account so that it provides optimal results under Mahalanobis distance constraints. Experiments based on both computer simulation and real data sets are presented and analyzed to evaluate the performance of the calibration method. A comparison with a popular camera/LIDAR calibration method is also presented to show the benefits of our method.
- Research Article
8
- 10.1002/arp.1869
- Jun 16, 2022
- Archaeological Prospection
Potential and limitations of LiDAR altimetry in archaeological survey. Copper Age and Bronze Age settlements in southern Iberia
- Research Article
- 10.1109/access.2025.3615993
- Jan 1, 2025
- IEEE Access
In autonomous systems and robotic applications, accurate extrinsic calibration between light detection and ranging (LiDAR) sensors and cameras is crucial for reliable sensor fusion. Several techniques have been developed, including target-based and targetless calibration, but they are either impractical for real-world applications or limited in extracting complex and diverse features. This study presents TransCalib, an innovative deep-learning method for targetless and automatic extrinsic calibration. TransCalib predicts the misalignment between the camera and LiDAR by leveraging EfficientNetV2 to obtain features from the RGB camera image and LiDAR point cloud projection image (depth image), owing to its performance and parameter efficiency. We also developed an innovative feature-matching module that comprises a calibration convolutional feature aggregation block (Calib-CFAB) and a convolutional self-attention (CSA) transformer. Calib-CFAB enriches the combined feature map of the RGB and depth images, while the CSA transformer obtains the correlation in the feature maps. Trained and tested on the KITTI odometry dataset, TransCalib achieved a mean absolute rotation error of 0.14° and a mean translation error of 1.8 cm, outperforming existing methods. The proposed method allows for a robust fusion of LiDAR and camera data, improving the perception abilities of autonomous systems.
- Research Article
45
- 10.3390/geosciences9070323
- Jul 23, 2019
- Geosciences
Digital elevation model (DEM) has been frequently used for the reduction and management of flood risk. Various classification methods have been developed to extract DEM from point clouds. However, the accuracy and computational efficiency need to be improved. The objectives of this study were as follows: (1) to determine the suitability of a new method to produce DEM from unmanned aerial vehicle (UAV) and light detection and ranging (LiDAR) data, using a raw point cloud classification and ground point filtering based on deep learning and neural networks (NN); (2) to test the convenience of rebalancing datasets for point cloud classification; (3) to evaluate the effect of the land cover class on the algorithm performance and the elevation accuracy; and (4) to assess the usability of the LiDAR and UAV structure from motion (SfM) DEM in flood risk mapping. In this paper, a new method of raw point cloud classification and ground point filtering based on deep learning using NN is proposed and tested on LiDAR and UAV data. The NN was trained on approximately 6 million points from which local and global geometric features and intensity data were extracted. Pixel-by-pixel accuracy assessment and visual inspection confirmed that filtering point clouds based on deep learning using NN is an appropriate technique for ground classification and producing DEM, as for the test and validation areas, both ground and non-ground classes achieved high recall (>0.70) and high precision values (>0.85), which showed that the two classes were well handled by the model. The type of method used for balancing the original dataset did not have a significant influence in the algorithm accuracy, and it was suggested not to use any of them unless the distribution of the generated and real data set will remain the same. Furthermore, the comparisons between true data and LiDAR and a UAV structure from motion (UAV SfM) point clouds were analyzed, as well as the derived DEM. 
The root mean square error (RMSE) and the mean average error (MAE) of the DEM were 0.25 m and 0.05 m, respectively, for LiDAR data, and 0.59 m and –0.28 m, respectively, for UAV data. For all land cover classes, the UAV DEM overestimated the elevation, whereas the LIDAR DEM underestimated it. The accuracy was not significantly different in the LiDAR DEM for the different vegetation classes, while for the UAV DEM, the RMSE increased with the height of the vegetation class. The comparison of the inundation areas derived from true LiDAR and UAV data for different water levels showed that in all cases, the largest differences were obtained for the lowest water level tested, while they performed best for very high water levels. Overall, the approach presented in this work produced DEM from LiDAR and UAV data with the required accuracy for flood mapping according to European Flood Directive standards. Although LiDAR is the recommended technology for point cloud acquisition, a suitable alternative is also UAV SfM in hilly areas.
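The two DEM accuracy figures quoted above can be sketched as follows. The negative value reported for the UAV data suggests a signed mean of the elevation differences rather than a mean of absolute values; that reading is an assumption here, and the function names and sample numbers are purely illustrative.

```python
import math

def rmse(errors):
    """Root mean square of elevation differences (DEM minus reference)."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))

def mean_error(errors):
    """Signed mean of elevation differences; the sign indicates the bias."""
    return sum(errors) / len(errors)

# Made-up elevation differences in metres, for illustration only.
diffs = [3.0, -4.0]
spread = rmse(diffs)
bias = mean_error(diffs)
```

The RMSE is never negative and penalises large deviations, while the signed mean shows whether a DEM tends to over- or underestimate elevation.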
- Research Article
2
- 10.6574/jprs.2014.19(1).4
- Nov 1, 2014
LiDAR (Light Detection and Ranging) point clouds are measurements of irregularly distributed points on scanned object surfaces acquired with airborne or terrestrial LiDAR systems. Feature extraction is the key to transform LiDAR data into spatial information. Surface features are dominant in most LiDAR data corresponding to scanned object surfaces. This paper proposes a general method to segment co-surface points. An incremental segmentation strategy is developed for the implementation, which comprises several algorithms and employs various criteria to gradually segment LiDAR point clouds into several levels. There are four operation steps. First, the proximity of point clouds is established as spatial indices defined in an octree-structured voxel space. Second, a connected-component labeling algorithm for voxels is applied for segmenting neighboring points. Third, coplanar points then can be segmented with the octree-based split-and-merge algorithm as plane features. Finally, combining neighboring plane features forms surface features. With respect to each step, processed LiDAR point clouds are segmented into organized points, neighboring point groups, coplanar point groups, and co-surface point groups. The proposed method enables an incremental retrieval and analysis of a large LiDAR dataset. Experiment results demonstrate the effectiveness of the segmentation algorithm in handling both airborne and terrestrial LiDAR data. The end results as well as the intermediate results of the segmentation may be useful for object modeling of different purposes using LiDAR data.
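The first two steps of the strategy above, indexing points into voxels and labelling connected voxels, might look like the sketch below. It uses a flat set of voxel indices and 6-connectivity instead of the octree structure described in the paper, and the names `voxelize` and `label_components` are assumptions.

```python
import math
from collections import deque

# 6-connected neighbourhood: voxels that share a face.
NEIGHBORS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]

def voxelize(points, size):
    """Map each point to the integer index of the voxel that contains it."""
    return {tuple(math.floor(c / size) for c in p) for p in points}

def label_components(voxels):
    """Assign one integer label per connected group of occupied voxels."""
    labels = {}
    next_label = 0
    for seed in sorted(voxels):
        if seed in labels:
            continue
        labels[seed] = next_label
        queue = deque([seed])
        while queue:
            x, y, z = queue.popleft()
            for dx, dy, dz in NEIGHBORS:
                nb = (x + dx, y + dy, z + dz)
                if nb in voxels and nb not in labels:
                    labels[nb] = next_label
                    queue.append(nb)
        next_label += 1
    return labels

cloud = [(0.1, 0.1, 0.1), (1.2, 0.1, 0.1), (5.0, 5.0, 5.0)]
labels = label_components(voxelize(cloud, 1.0))
```

Each resulting group of neighbouring points could then be passed on to a plane-fitting step such as the split-and-merge stage the abstract mentions.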
- Research Article
3
- 10.3390/s22010106
- Dec 24, 2021
- Sensors (Basel, Switzerland)
Light Detection and Ranging (LiDAR) is a sensor that uses a laser to represent the surrounding environment as three-dimensional information. Thanks to the development of LiDAR, LiDAR-based applications are being actively used in autonomous vehicles. In order to effectively use the information coming from LiDAR, extrinsic calibration, which finds the translation and rotation relationship between the LiDAR coordinate frame and the vehicle coordinate frame, is essential. Therefore, many studies on LiDAR extrinsic calibration are steadily in progress. The performance index (PI) of the calibration parameter is a value that quantitatively indicates whether the obtained calibration parameter is similar to the true value or not. In order to effectively use the obtained calibration parameter, it is important to validate the parameter through PI. Therefore, in this paper, we propose an algorithm to obtain the performance index for the calibration parameter between LiDAR and the motion sensor. This performance index is experimentally verified in various environments by Monte Carlo simulation and validated using CarMaker simulation data and real data. As a result of verification, the PI of the calibration parameter obtained through the proposed algorithm has the smallest value when the calibration parameter has a true value, and increases as an error is added to the true value. In other words, it has been shown that PI is convex with respect to the calibration parameter. In addition, the results confirm that the PI obtained using the proposed algorithm provides information on the effect of the calibration parameters on mapping and localization.
- Research Article
1
- 10.4028/www.scientific.net/amm.536-537.338
- Apr 1, 2014
- Applied Mechanics and Materials
Light detection and ranging (LIDAR) sensors are widely used in robotics. In this paper, we deal with the extrinsic calibration between a camera and a rotating LIDAR. It is necessary to fuse the information of each system into one common coordinate system. LIDAR sensors give denser and more accurate depth information than a stereo system, but a camera can give more diverse information about the scene, including brightness distribution and color. We present an extrinsic calibration algorithm using two planes that are configured perpendicular to each other. A LIDAR is rotated using a motor to obtain full 3D information about the calibration structure. We find the extrinsic parameters of the LIDAR with respect to the world on the vertical plane using the 3D information. The extrinsic parameters of the camera with respect to the world can be found using a traditional calibration algorithm. Finally, we can compute the extrinsic parameters between the camera and the LIDAR. Experimental results show the feasibility of the presented algorithm.
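The final step, combining the two sensor-to-world extrinsics into a camera-LIDAR extrinsic, can be sketched with 4x4 homogeneous transforms. The convention is an assumption: `T_world_lidar` and `T_world_cam` are taken to map sensor coordinates into world coordinates, and the helper names are hypothetical.

```python
def matmul4(a, b):
    """Product of two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def invert_rigid(T):
    """Invert a rigid transform [R | t] as [R^T | -R^T t]."""
    R_t = [[T[j][i] for j in range(3)] for i in range(3)]
    t = [T[i][3] for i in range(3)]
    t_inv = [-sum(R_t[i][j] * t[j] for j in range(3)) for i in range(3)]
    return [R_t[i] + [t_inv[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]

# Assumed example poses: the LIDAR sits 1 m along world x; the camera sits
# 2 m along world y and is rotated 90 degrees about the z axis.
T_world_lidar = [[1.0, 0.0, 0.0, 1.0],
                 [0.0, 1.0, 0.0, 0.0],
                 [0.0, 0.0, 1.0, 0.0],
                 [0.0, 0.0, 0.0, 1.0]]
T_world_cam = [[0.0, -1.0, 0.0, 0.0],
               [1.0, 0.0, 0.0, 2.0],
               [0.0, 0.0, 1.0, 0.0],
               [0.0, 0.0, 0.0, 1.0]]

# Maps LIDAR coordinates into camera coordinates.
T_cam_lidar = matmul4(invert_rigid(T_world_cam), T_world_lidar)
```

If the opposite convention is used (world-to-sensor transforms), the inverse moves to the LIDAR term instead.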
- Research Article
73
- 10.3390/s20041102
- Feb 18, 2020
- Sensors (Basel, Switzerland)
Crop 3D modeling allows site-specific management at different crop stages. In recent years, light detection and ranging (LiDAR) sensors have been widely used for gathering information about plant architecture to extract biophysical parameters for decision-making programs. The study reconstructed vineyard crops using light detection and ranging (LiDAR) technology. Its accuracy and performance were assessed for vineyard crop characterization using distance measurements, aiming to obtain a 3D reconstruction. A LiDAR sensor was installed on-board a mobile platform equipped with an RTK-GNSS receiver for crop 2D scanning. The LiDAR system consisted of a 2D time-of-flight sensor, a gimbal connecting the device to the structure, and an RTK-GPS to record the sensor data position. The LiDAR sensor was facing downwards installed on-board an electric platform. It scans in planes perpendicular to the travel direction. Measurements of distance between the LiDAR and the vineyards had a high spatial resolution, providing high-density 3D point clouds. The 3D point cloud was obtained containing all the points where the laser beam impacted. The fusion of LiDAR impacts and the positions of each associated to the RTK-GPS allowed the creation of the 3D structure. Although point clouds were already filtered, discarding points out of the study area, the branch volume cannot be directly calculated, since it turns into a 3D solid cluster that encloses a volume. To obtain the 3D object surface, and therefore to be able to calculate the volume enclosed by this surface, a suitable alpha shape was generated as an outline that envelops the outer points of the point cloud. The 3D scenes were obtained during the winter season when only branches were present and defoliated. The models were used to extract information related to height and branch volume. These models might be used for automatic pruning or relating this parameter to evaluate the future yield at each location. 
The 3D map was correlated with ground truth, which was manually determined, pruning the remaining weight. The number of scans by LiDAR influenced the relationship with the actual biomass measurements and had a significant effect on the treatments. A positive linear fit was obtained for the comparison between actual dry biomass and LiDAR volume. The influence of individual treatments was of low significance. The results showed strong correlations with actual values of biomass and volume with R2 = 0.75, and when comparing LiDAR scans with weight, the R2 rose up to 0.85. The obtained values show that this LiDAR technique is also valid for branch reconstruction with great advantages over other types of non-contact ranging sensors, regarding a high sampling resolution and high sampling rates. Even narrow branches were properly detected, which demonstrates the accuracy of the system working on difficult scenarios such as defoliated crops.
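The linear fit and coefficient of determination reported above can be reproduced in outline with ordinary least squares. This is a generic sketch, not the study's analysis code; the function name `linear_fit_r2` and the sample values are illustrative assumptions.

```python
def linear_fit_r2(xs, ys):
    """Least-squares line y = slope * x + intercept and its R^2."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1.0 - ss_res / ss_tot

# Made-up LiDAR volumes (x) and dry biomass values (y), for illustration only.
slope, intercept, r2 = linear_fit_r2([0.0, 1.0, 2.0, 3.0], [0.0, 1.0, 0.0, 1.0])
```

An R2 near 1 means the LiDAR-derived volume explains most of the variation in measured biomass; values such as 0.75 or 0.85 indicate a strong but imperfect linear relationship.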
- Research Article
12
- 10.3390/rs15184529
- Sep 14, 2023
- Remote Sensing
Light detection and ranging (LiDAR) is a widely used technology for the acquisition of three-dimensional (3D) information about a wide variety of physical objects and environments. However, before conducting a campaign, a test is typically conducted to assess the potential of the utilized algorithm for information retrieval. It might not be a real campaign but rather a simulation to save time and costs. Here, a multi-platform LiDAR simulation model considering the location, direction, and wavelength of each emitted laser pulse was developed based on the large-scale remote sensing (RS) data and image simulation framework (LESS) model, which is a 3D radiative transfer model for simulating passive optical remote sensing signals using the ray tracing algorithm. The LESS LiDAR simulator took footprint size, returned energy, multiple scattering, and multispectrum LiDAR into account. The waveform and point similarity were assessed with the LiDAR module of the discrete anisotropic radiative transfer (DART) model. Abstract and realistic scenes were designed to assess the simulated LiDAR waveforms and point clouds. A waveform comparison in the abstract scene with the DART LiDAR module showed that the relative error was lower than 1%. In the realistic scene, airborne and terrestrial laser scanning were simulated by LESS and DART LiDAR modules. Their coefficients of determination ranged from 0.9108 to 0.9984. Their mean was 0.9698. The number of discrete returns fitted well and the coefficient of determination was 0.9986. A terrestrial point cloud comparison in the realistic scene showed that the coefficient of determination between the two sets of data could reach 0.9849. The performance of the LESS LiDAR simulator was also compared with the DART LiDAR module and HELIOS++. The results showed that the LESS LiDAR simulator is over three times faster than the DART LiDAR module and HELIOS++ when simulating terrestrial point clouds in a realistic scene. 
The proposed LiDAR simulator offers two modes for simulating point clouds: single-ray and multi-ray modes. The findings demonstrate that utilizing a single-ray simulation approach can significantly reduce the simulation time, by over 28 times, without substantially affecting the overall point number or ground points when compared to employing multiple rays for simulations. This new LESS model integrating a LiDAR simulator has great potential in terms of simultaneously simulating LiDAR data and optical images based on the same 3D scene and parameters. As a proof of concept, the normalized difference vegetation index (NDVI) results from multispectral images and the vertical profiles from multispectral LiDAR waveforms were simulated and analyzed. The results showed that the proposed LESS LiDAR simulator can fulfill its design goals.
- Research Article
20
- 10.1016/j.compag.2022.107420
- Oct 14, 2022
- Computers and Electronics in Agriculture
Information fusion approach for biomass estimation in a plateau mountainous forest using a synergistic system comprising UAS-based digital camera and LiDAR
- Dissertation
- 10.32469/10355/107643
- Dec 1, 2024
The integration of Internet of Things (IoT) technology, which includes LiDAR sensors, is changing traffic operation, safety, and infrastructure assessment. LiDAR sensors, along with other IoT devices such as cameras and GPS units, are embedded within transportation infrastructures to provide real-time, precise data about traffic conditions and road safety. This wealth of data enables dynamic traffic control, facilitates the immediate identification of safety hazards, and supports proactive maintenance strategies for roads and bridges. Furthermore, the study focuses on leveraging machine learning to develop low-cost vision systems that can effectively utilize data streaming from infrastructure-mounted sensors to enhance traffic operations, safety and infrastructure condition monitoring. Immediate access to high-resolution traffic parameters from surveillance videos allows for a more dynamic approach to traffic control, incident detection, and congestion management. While cameras combined with CV techniques are vital in analyzing real-time traffic scenarios, another vision-based sensor, Light Detection and Ranging (LiDAR), with its precise laser scanning, also brings transformative advantages to transportation systems. LiDAR revolutionizes transportation systems by delivering higher accuracy and comprehensive coverage data for monitoring traffic complexities. However, limitations including the requirement for complex camera calibration and incomplete point cloud information have made the use of these sensors costly and impeded their widespread adoption. Given the advantages of both sensors, it is necessary to develop a low-cost vision-based system that fuses the capabilities of both camera and LiDAR. Such a system would combine the distinctive advantages of both sensors to overcome their individual constraints and would work as a tool to optimize the collection process and improve transportation safety.
The author proposed three innovative objectives based on the previous research gaps. First, the research investigates the distinct benefits of the camera and LiDAR in extracting high-resolution traffic data from a singular sensory input, and it features the automatic annotation of 3D bounding boxes in the LiDAR domain. Second, the study designed a framework that fuses camera and LiDAR technology, capable of executing real-time object recognition and depth perception simultaneously. Third, the study developed a machine-learning framework to assess transportation infrastructure utilizing a single image in daytime lighting conditions. The first objective covers both camera and LiDAR; the specific objectives are: 1). Implements a 2D homography technique for perspective transformation of traffic scenes to a bird's-eye view, which reduces the effect of partial occlusion and improves the accuracy of speed and acceleration data collection; CCTV cameras are automatically calibrated to convert pixel distances to real-world distances; Used the concept of spatial-temporal neighbors to improve the tracking results and thereby address the challenge posed by misassignment of vehicle trajectories. 2). We propose an innovative and effective framework for obtaining full point cloud representations of partially occluded objects using just a single LiDAR system, thus addressing the limitation of requiring multiple LiDAR units for comprehensive data acquisition; We introduce a novel framework for generating 3D bounding box annotations without relying on human annotators. This development is crucial in streamlining the labor-intensive and time-consuming process of annotating LiDAR data for traffic detection and classification models; The study pioneers the application of zero-shot learning techniques for vehicle and pedestrian detection, classification, and counting. 3). Develops a framework for spatial-temporal fusion of 360-degree video and LiDAR data.
This framework enables us to associate each bounding box proposal in the 2D video domain with actual distances or depths generated from mobile LiDAR data; Define a novel architecture that extends the output prediction vectors of a Darknet-like backbone with information about depth. We also introduce a new loss that enables the network to simultaneously generate bounding box proposals and the corresponding depth of each object in a single shot; Generated a large database of annotations for machine learning model development and comparative analysis. 4). Generate a large dataset of transportation infrastructure collected from camera and LiDAR. Developed an innovative framework that integrates transportation infrastructure image and class information with LiDAR-derived information, such as depth and reflectivity intensity data, through a novel Double U-Net architecture, enabling pixel-level reflectivity predictions for detailed infrastructure assessment.
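The 2D homography step listed under the first objective can be illustrated with a short sketch that maps pixels to a bird's-eye ground plane and measures real-world distance there. The matrix values and the names `apply_homography` and `ground_distance` are assumptions; the dissertation's calibration procedure for obtaining the homography is not described in the abstract.

```python
import math

def apply_homography(H, u, v):
    """Map pixel (u, v) to ground-plane coordinates with a 3x3 homography."""
    x = H[0][0] * u + H[0][1] * v + H[0][2]
    y = H[1][0] * u + H[1][1] * v + H[1][2]
    w = H[2][0] * u + H[2][1] * v + H[2][2]
    if w == 0:
        raise ValueError("pixel maps to a point at infinity")
    return x / w, y / w

def ground_distance(H, pixel_a, pixel_b):
    """Real-world distance between two pixels after the bird's-eye mapping."""
    ax, ay = apply_homography(H, *pixel_a)
    bx, by = apply_homography(H, *pixel_b)
    return math.hypot(bx - ax, by - ay)

# Assumed homography: a pure scale of 0.1 m per pixel, for illustration only.
H_example = [[0.1, 0.0, 0.0],
             [0.0, 0.1, 0.0],
             [0.0, 0.0, 1.0]]
metres = ground_distance(H_example, (0.0, 0.0), (30.0, 40.0))
```

Dividing such a distance by the time between two video frames gives a speed estimate, which is how a pixel-to-world mapping supports the speed and acceleration measurements mentioned above.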