Low-Latency LiDAR Semantic Segmentation
Several methods of semantic segmentation using light detection and ranging (LiDAR) sensors have been proposed for the recognition of surrounding objects by autonomous vehicles. LiDAR compensates for the weaknesses of other sensors, such as cameras or radar systems, and semantic segmentation assigns a class label to each point in the LiDAR point cloud. Recently, real-time semantic segmentation methods capable of processing LiDAR point clouds at the sensor frame rate have been proposed. Real-time semantic segmentation is essential for autonomous driving systems because it outputs class labels for LiDAR point clouds at high speed. However, the output of such methods is still delayed by an amount equal to the processing time. To address this challenge, we propose a novel method that combines SalsaNext [1], a real-time LiDAR semantic segmentation method, with semantic forecasting, which predicts the results of future semantic segmentation. We quantitatively evaluate our method on the SemanticKITTI dataset, which comprises point cloud data acquired from a LiDAR sensor in the real world, and compare its latency and accuracy with those of other semantic segmentation methods. The results show that our method operates in real time with low latency and achieves performance similar to that of previously reported real-time semantic segmentation methods.
- Research Article
2
- 10.6574/jprs.2014.19(1).4
- Nov 1, 2014
LiDAR (Light Detection and Ranging) point clouds are measurements of irregularly distributed points on scanned object surfaces acquired with airborne or terrestrial LiDAR systems. Feature extraction is the key to transform LiDAR data into spatial information. Surface features are dominant in most LiDAR data corresponding to scanned object surfaces. This paper proposes a general method to segment co-surface points. An incremental segmentation strategy is developed for the implementation, which comprises several algorithms and employs various criteria to gradually segment LiDAR point clouds into several levels. There are four operation steps. First, the proximity of point clouds is established as spatial indices defined in an octree-structured voxel space. Second, a connected-component labeling algorithm for voxels is applied for segmenting neighboring points. Third, coplanar points then can be segmented with the octree-based split-and-merge algorithm as plane features. Finally, combining neighboring plane features forms surface features. With respect to each step, processed LiDAR point clouds are segmented into organized points, neighboring point groups, coplanar point groups, and co-surface point groups. The proposed method enables an incremental retrieval and analysis of a large LiDAR dataset. Experiment results demonstrate the effectiveness of the segmentation algorithm in handling both airborne and terrestrial LiDAR data. The end results as well as the intermediate results of the segmentation may be useful for object modeling of different purposes using LiDAR data.
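The first two steps described above (building a voxel index, then labeling connected voxels to group neighboring points) can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: it uses a flat hash map of voxels instead of an octree, assumes 26-connectivity, and the names `voxel_key` and `label_connected_voxels` and the `voxel_size` parameter are hypothetical.

```python
from collections import deque
from itertools import product


def voxel_key(point, voxel_size):
    """Map an (x, y, z) point to the integer index of the voxel containing it."""
    return tuple(int(c // voxel_size) for c in point)


def label_connected_voxels(points, voxel_size=1.0):
    """Group points whose occupied voxels touch (assumed 26-connectivity).

    Returns a list holding one integer component label per input point.
    """
    # Bucket point indices by the voxel they fall into (flat grid, not an octree).
    voxels = {}
    for i, p in enumerate(points):
        voxels.setdefault(voxel_key(p, voxel_size), []).append(i)

    # All 26 neighbor offsets around a voxel.
    offsets = [o for o in product((-1, 0, 1), repeat=3) if o != (0, 0, 0)]

    voxel_label = {}
    next_label = 0
    for seed in voxels:
        if seed in voxel_label:
            continue
        # Breadth-first flood fill over occupied neighboring voxels.
        voxel_label[seed] = next_label
        queue = deque([seed])
        while queue:
            vx, vy, vz = queue.popleft()
            for dx, dy, dz in offsets:
                nb = (vx + dx, vy + dy, vz + dz)
                if nb in voxels and nb not in voxel_label:
                    voxel_label[nb] = next_label
                    queue.append(nb)
        next_label += 1

    # Transfer voxel labels back to the individual points.
    labels = [0] * len(points)
    for key, idxs in voxels.items():
        for i in idxs:
            labels[i] = voxel_label[key]
    return labels
```

For example, `label_connected_voxels([(0.1, 0.1, 0.1), (1.2, 0.3, 0.2), (10.0, 10.0, 10.0)])` places the first two points in one group and the distant third point in another.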
- Research Article
18
- 10.1109/tgrs.2023.3264102
- Jan 1, 2023
- IEEE Transactions on Geoscience and Remote Sensing
Sequential point clouds acquired by light detection and ranging (LiDAR) technology provide accurate spatial information for environmental sensing. However, semantic segmentation of point cloud sequences relies on many manual point-wise annotations, which are error-prone and expensive. Existing mainstream weakly supervised methods tackle this by reducing the percentage of labeled points, but they are mostly designed for static indoor scenes and are hard to apply practically. From the viewpoint of realistic annotation procedures and the nature of point cloud sequences, this paper proposes a novel semantic segmentation method, SemanticFlow, for LiDAR point cloud sequences using sparse frames with annotations. The proposed method achieves competitive performance compared with fully supervised methods. Specifically, we designed a bidirectional cross-frame pseudo label propagation module that uses scene flow to learn the correlation and propagate pseudo labels across neighboring frames. In addition, a label refinement mechanism is proposed to select reliable pseudo labels for learning. Extensive experiments on SemanticKITTI, SemanticPOSS, and Synthia 4D datasets demonstrate that our sparse frame annotation method is compatible with some fully supervised counterparts.
- Research Article
46
- 10.1080/02533839.2008.9671456
- Sep 1, 2008
- Journal of the Chinese Institute of Engineers
Techniques for extracting data from LiDAR point clouds can be refined for increased accuracy. In this paper, the authors elaborate on an innovative approach for registering ground‐based LiDAR point clouds using overlapping scans based on 3D line features. The proposed working scheme consists of three major kernels: a 3D line feature extractor, a 3D line feature matching mechanism, and a mathematical model for simultaneously registering ground‐based LiDAR point clouds of multi‐scans on a 3D line feature basis. All processing chains in this study are featured efficiently and come close to meeting the needs of practical usage. Experiments conducted show the proposed method of employing 3D line features to be a useful alternative or complement to point, surface and other features for LiDAR (Light Detection And Ranging) point clouds registration. It is especially effective in areas rich in man‐made structures.
- Research Article
2
- 10.3389/fphy.2025.1548786
- Jan 30, 2025
- Frontiers in Physics
LiDAR (Light Detection and Ranging) is an essential device for capturing the depth information of objects. Unmanned aerial vehicles (UAV) can sense the surrounding environment through LiDAR and image sensors to make autonomous flight decisions. In this process, aerial slender targets, such as overhead power lines, pose a threat to the flight safety of UAVs. These targets have complex backgrounds, elongated shapes, and small reflection cross-sections, making them difficult to detect directly from LiDAR point clouds. To address this issue, this paper takes overhead power lines as a representative example of aerial slender targets and proposes a method that uses visible light images to guide the segmentation of LiDAR point clouds under large depth-of-field conditions. The method introduces an image segmentation algorithm for overhead power lines based on a voting mechanism and designs a calibration algorithm for LiDAR point clouds and images in scenarios with a large depth of field. Experimental results demonstrate that in various complex scenes, this method can segment the LiDAR point clouds of overhead power lines, thereby obtaining accurate positions and exhibiting good adaptability across multiple scenes. Compared to traditional point cloud segmentation methods, the segmentation accuracy of the proposed method is significantly improved, promoting the practical application of LiDAR.
- Research Article
1
- 10.4467/21995923gp.24.006.20473
- Jan 1, 2024
- Geoinformatica Polonica
The Crown of Polish Mountains is a list of mountain peaks that has long attracted significant interest, with all included summits considered worth conquering. The proposal to expand this list with additional peaks, termed the “New Crown of Polish Mountains” by historian Krzysztof Bzowski, served as the impetus for a study examining the accuracy of LiDAR (Light Detection and Ranging) point clouds in the areas of the newly proposed peaks. The primary data source analyzed in this study is the LiDAR point cloud with a density of 4 points per square meter, obtained from the ISOK project. As a secondary LiDAR data source, a self-generated point cloud was used, created with the integrated LiDAR sensor in the iPhone 13 Pro and the free 3dScannerApp mobile application for terrestrial scanning. These datasets were compared against RTK GNSS measurements obtained with a Leica GS16 receiver and mobile measurements conducted using Android smartphones. In addition to analyzing the raw point clouds, the study also involved visualizing the analyzed areas by creating Digital Terrain Models in two software programs: ArcGIS Pro and QGIS Desktop. The research confirmed the known accuracy of ALS point clouds and revealed that the integrated LiDAR sensor in the iPhone 13 Pro demonstrates surprising accuracy. The potential for laser scanning with a smartphone, combined with the capability of conducting mobile GNSS measurements, could revolutionize geodetic surveying and simplify the acquisition of point cloud data.
- Conference Article
- 10.4271/2020-01-0703
- Apr 14, 2020
- SAE technical papers on CD-ROM/SAE technical paper series
Light detection and ranging (LiDAR) is becoming an essential sensor for autonomous vehicles. The LiDAR provides information about the vehicle's surrounding environment in the form of a point cloud. The decision-making system of an autonomous car can determine a safe and comfortable maneuver by using the detected LiDAR point cloud. The points in the cloud are classified as dynamic or static depending on the movement of the object being detected. If the movement class (dynamic or static) of detected points can be provided by LiDAR, the decision-making system is able to plan the appropriate motion of the autonomous vehicle according to the movement of the object. This paper proposes a real-time process to segment the motion states of LiDAR points. The basic principle of the classification algorithm is to classify the point-wise movement of a target point cloud using the other point clouds and sensor poses. First, a fixed-size buffer stores the LiDAR point clouds and sensor poses for a constant time window. Second, motion beliefs of the target point cloud against each of the other point clouds and sensor poses in the buffer are estimated. Each motion belief of the points in the target point cloud is represented by a series of masses of dynamic, static, and unknown based on evidence theory. Finally, the series of motion belief masses of the target point cloud for the other point clouds and poses are integrated through the Dempster-Shafer combination. The integrated mass value is used to classify the point-wise motion of the target point cloud into the states dynamic, static, and unknown. The proposed algorithm was quantitatively evaluated through the simulation of LiDAR sensors and the surrounding environment. Then, the algorithm was qualitatively validated through experiments using an autonomous car equipped with LiDAR. The autonomous vehicle was able to perform 3D point cloud mapping and map-matching localization.
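To make the Dempster-Shafer integration step concrete, the following minimal sketch combines per-point belief masses over dynamic, static, and unknown. It assumes that "unknown" is the mass assigned to the whole frame {dynamic, static}, which is the usual evidence-theory reading but is not spelled out in the abstract. The function names and the final decision rule (pick the largest fused mass) are hypothetical and not taken from the paper.

```python
from functools import reduce


def combine(m1, m2):
    """Dempster's rule of combination for two mass functions.

    Each mass function is a dict with keys "dynamic", "static", "unknown",
    where "unknown" is treated as the mass on the set {dynamic, static}.
    """
    # Conflict: one source says dynamic while the other says static.
    conflict = m1["dynamic"] * m2["static"] + m1["static"] * m2["dynamic"]
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict and cannot be combined")
    norm = 1.0 - conflict
    return {
        "dynamic": (m1["dynamic"] * m2["dynamic"]
                    + m1["dynamic"] * m2["unknown"]
                    + m1["unknown"] * m2["dynamic"]) / norm,
        "static": (m1["static"] * m2["static"]
                   + m1["static"] * m2["unknown"]
                   + m1["unknown"] * m2["static"]) / norm,
        "unknown": (m1["unknown"] * m2["unknown"]) / norm,
    }


def classify(beliefs):
    """Fuse a sequence of mass functions and return (label, fused masses).

    The label is simply the state with the largest fused mass (an assumed rule).
    """
    fused = reduce(combine, beliefs)
    return max(fused, key=fused.get), fused
```

With two observations such as `{"dynamic": 0.6, "static": 0.1, "unknown": 0.3}` and `{"dynamic": 0.5, "static": 0.2, "unknown": 0.3}`, the fused masses still sum to one and the dynamic mass increases, since both sources lean the same way.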
- Research Article
53
- 10.3390/rs13112062
- May 24, 2021
- Remote Sensing
Estimation of urban tree canopy parameters plays a crucial role in urban forest management. Unmanned aerial vehicles (UAV) have been widely used for many applications, particularly forestry mapping. UAV-derived images, captured by an onboard camera, provide a means to produce 3D point clouds using photogrammetric mapping. Similarly, small UAV-mounted light detection and ranging (LiDAR) sensors can also provide very dense 3D point clouds. While point clouds derived from both photogrammetric and LiDAR sensors can allow the accurate estimation of critical tree canopy parameters, a comparison of the two techniques has so far been missing. Because point clouds derived from these sources vary according to differences in data collection and processing, a detailed comparison of their accuracy and completeness in relation to tree canopy parameters is necessary. In this research, point clouds produced by UAV photogrammetry and UAV LiDAR over an urban park, along with the estimated tree canopy parameters, are compared, and results are presented. The results show that the UAV photogrammetry and UAV LiDAR point clouds are highly correlated, with an R² of 99.54%, and that the estimated tree canopy parameters are correlated with an R² higher than 95%.
- Research Article
8
- 10.1002/arp.1869
- Jun 16, 2022
- Archaeological Prospection
Potential and limitations of LiDAR altimetry in archaeological survey. Copper Age and Bronze Age settlements in southern Iberia
- Research Article
12
- 10.3390/s22166210
- Aug 18, 2022
- Sensors (Basel, Switzerland)
Mobile light detection and ranging (LiDAR) sensor point clouds are used in many fields such as road network management, architecture and urban planning, and 3D High Definition (HD) city maps for autonomous vehicles. Semantic segmentation of mobile point clouds is critical for these tasks. In this study, we present a robust and effective deep learning-based point cloud semantic segmentation method. Semantic segmentation is applied to range images produced from point cloud with spherical projection. Irregular 3D mobile point clouds are transformed into regular form by projecting the clouds onto the plane to generate 2D representation of the point cloud. This representation is fed to the proposed network that produces semantic segmentation. The local geometric feature vector is calculated for each point. Optimum parameter experiments were also performed to obtain the best results for semantic segmentation. The proposed technique, called SegUNet3D, is an ensemble approach based on the combination of U-Net and SegNet algorithms. SegUNet3D algorithm has been compared with five different segmentation algorithms on two challenging datasets. SemanticPOSS dataset includes the urban area, whereas RELLIS-3D includes the off-road environment. As a result of the study, it was demonstrated that the proposed approach is superior to other methods in terms of mean Intersection over Union (mIoU) in both datasets. The proposed method was able to improve the mIoU metric by up to 15.9% in the SemanticPOSS dataset and up to 5.4% in the RELLIS-3D dataset.
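The spherical projection that turns an irregular point cloud into a regular range image can be sketched as below. This follows the commonly used yaw/pitch mapping and is only an illustration: the image size (64 × 1024) and the vertical field of view (+3° to −25°) are assumed values, not parameters reported for SegUNet3D, and the function names are hypothetical.

```python
import math


def spherical_project(point, height=64, width=1024,
                      fov_up_deg=3.0, fov_down_deg=-25.0):
    """Project one (x, y, z) point to (row, col, range) in a range image.

    Returns None for a point at the sensor origin, which has no direction.
    """
    x, y, z = point
    r = math.sqrt(x * x + y * y + z * z)
    if r == 0.0:
        return None
    yaw = math.atan2(y, x)       # horizontal angle in [-pi, pi]
    pitch = math.asin(z / r)     # vertical angle
    fov_up = math.radians(fov_up_deg)
    fov_down = math.radians(fov_down_deg)
    fov = fov_up - fov_down      # total vertical field of view
    u = 0.5 * (1.0 - yaw / math.pi) * width
    v = (1.0 - (pitch - fov_down) / fov) * height
    # Clamp to valid pixel indices; points outside the FOV land on the border.
    col = min(width - 1, max(0, int(math.floor(u))))
    row = min(height - 1, max(0, int(math.floor(v))))
    return row, col, r


def build_range_image(points, height=64, width=1024):
    """Fill a height x width grid with the nearest range per pixel (-1 = empty)."""
    image = [[-1.0] * width for _ in range(height)]
    for p in points:
        projected = spherical_project(p, height, width)
        if projected is None:
            continue
        row, col, r = projected
        if image[row][col] < 0 or r < image[row][col]:
            image[row][col] = r
    return image
```

Keeping only the nearest return per pixel is one common way to resolve several points falling into the same cell; other choices (for example keeping the last point) are equally possible.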
- Conference Article
2
- 10.4271/2023-01-0740
- Apr 11, 2023
- SAE technical papers on CD-ROM/SAE technical paper series
Image segmentation has historically been a technique for analyzing terrain for military autonomous vehicles. One of the weaknesses of image segmentation from camera data is that it lacks depth information, and it can be affected by environment lighting. Light detection and ranging (LiDAR) is an emerging technology in image segmentation that is able to estimate distances to the objects it detects. One advantage of LiDAR is the ability to gather accurate distances regardless of day, night, shadows, or glare. This study examines LiDAR and camera image segmentation fusion to improve an advanced driver-assistance system (ADAS) algorithm for off-road autonomous military vehicles. The volume of points generated by LiDAR provides the vehicle with distance and spatial data surrounding the vehicle. Processing these point clouds with semantic segmentation is a computationally intensive process requiring fusion of camera and LiDAR data so that the neural network can process depth and image data simultaneously. We create fused depth images (RGB-Depth) by projecting the LiDAR points onto the images. A neural network is trained to segment the fused data from RELLIS-3D, which is a multi-modal data set for off-road robotics. This data set contains both LiDAR point clouds and corresponding RGB images for training the neural network. The labels from the data set are grouped as objects, traversable terrain, non-traversable terrain, and sky to balance underrepresented classes. A modified version of DeepLabv3+ with a ResNet-18 backbone achieves an overall accuracy of 93.989 percent.
- Research Article
6
- 10.1109/lgrs.2021.3099935
- Jan 1, 2022
- IEEE Geoscience and Remote Sensing Letters
Multiple spatial scales have been used extensively for feature extraction from light detection and ranging (LiDAR) point clouds. These features have been used for semantic classification, segmentation, and other data analysis methods. There is a gap in the adaptive methodology for the effective use of multiple scales here. This stems from determining the best strategy to aggregate the information or features gathered from different scales. The widely used multiscale method is feature extraction at an optimal scale, which is in itself an adaptive method. However, the success of identifying the optimal scale depends on the set of scales used in its determination, as it must include the scale where the global minimum of eigenentropy occurs. An alternative method is to average features across multiple scales, which works in specific scenarios. In order to improve the flexibility of using different methods in the same workflow, we propose an adaptive method for the selection of multiscale feature extraction for semantic classification of LiDAR point clouds, with a focus on airborne laser scans. Our decision-making process for finding the best multiscale method exploits spatial locality of the features. We show how such a control strategy can be implemented in an Apache Spark–Cassandra distributed system for processing large-scale point clouds using voxelization for preserving spatial locality, and binomial logistic regression for selecting voxels to implement a specific multiscale method at. Our results show significant improvement in classification accuracy in the Dayton Annotated Laser Earth Scan (DALES) data, implemented using Spark MLlib in our distributed system.
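The optimal-scale criterion mentioned above picks, for each point, the neighborhood scale at which eigenentropy is minimal. A minimal sketch of that selection follows, assuming the three eigenvalues of the local covariance matrix have already been computed at each candidate scale. The function names and the input layout (a dict from scale to eigenvalue triple) are hypothetical.

```python
import math


def eigenentropy(eigenvalues):
    """Shannon entropy of the eigenvalues after normalizing them to sum to 1."""
    total = sum(eigenvalues)
    if total <= 0:
        raise ValueError("eigenvalues must have a positive sum")
    normalized = [value / total for value in eigenvalues]
    # Terms with a zero eigenvalue contribute nothing (0 * ln 0 is taken as 0).
    return -sum(e * math.log(e) for e in normalized if e > 0)


def optimal_scale(eigenvalues_by_scale):
    """Return the candidate scale whose neighborhood has minimal eigenentropy."""
    return min(eigenvalues_by_scale,
               key=lambda scale: eigenentropy(eigenvalues_by_scale[scale]))
```

As the abstract notes, the result depends on the candidate set: `optimal_scale` can only return a scale that was supplied, so a global minimum lying between or outside the candidates is missed.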
- Research Article
33
- 10.1109/access.2021.3102632
- Jan 1, 2021
- IEEE Access
Point clouds derived from LiDAR (Light Detection and Ranging) and photogrammetry systems are used to extract building footprints in dense urban areas. Two extraction methods based on DSM (Digital Surface Model) images and point clouds are comprehensively evaluated and compared. Firstly, photogrammetric point clouds are generated from aerial images of downtown Guangzhou, China, and compared with corresponding LiDAR point clouds. Then, DSM images are created using these point clouds and a threshold segmentation method is applied for building extraction. Although regularized buildings can be extracted by selecting appropriate height thresholds for the LiDAR DSM and photogrammetric DSM, the photogrammetric DSM results show blurry building boundaries when tall trees are nearby. LiDAR DSM extraction performs better in terms of Precision, Recall, and F-score metrics. A DoN (Difference of Normals) approach based on point cloud datasets is also quantitatively and qualitatively demonstrated. Our experiments show that when a suitable radius threshold is selected, the method provides satisfactory normal calculation results and can successfully isolate building roofs from other objects in densely built-up areas. The majority of building extraction results have a precision >0.9 and favorable Recall and F-score results. There is high consistency between photogrammetric and LiDAR point clouds. Although LiDAR provides higher extraction accuracy, photogrammetry is also useful for its more convenient acquisition and higher point cloud densities.
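The Precision, Recall, and F-score metrics used to compare the extraction results can be computed as in the sketch below. It assumes the extracted and reference buildings are represented as sets of pixel (or cell) coordinates, which is an illustrative choice rather than the evaluation protocol of the paper. The function name is hypothetical.

```python
def precision_recall_f(predicted, reference):
    """Compute precision, recall, and F-score for two collections of cells.

    predicted: cells labeled as building by the extraction method.
    reference: cells labeled as building in the ground truth.
    """
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F-score as the harmonic mean of precision and recall.
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score
```

For instance, if four cells are extracted, three of them are correct, and the reference contains five building cells, precision is 0.75 and recall is 0.6.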
- Research Article
26
- 10.1016/j.ecoinf.2022.101836
- Sep 28, 2022
- Ecological Informatics
Quantifying ecosystem structure is of key importance for ecology, conservation, restoration, and biodiversity monitoring because the diversity, geographic distribution and abundance of animals, plants and other organisms is tightly linked to the physical structure of vegetation and associated microclimates. Light Detection And Ranging (LiDAR) — an active remote sensing technique — can provide detailed and high resolution information on ecosystem structure because the laser pulse emitted from the sensor and its subsequent return signal from the vegetation (leaves, branches, stems) delivers three-dimensional point clouds from which metrics of vegetation structure (e.g. ecosystem height, cover, and structural complexity) can be derived. However, processing 3D LiDAR point clouds into geospatial data products of ecosystem structure remains challenging across broad spatial extents due to the large volume of national or regional point cloud datasets (typically multiple terabytes consisting of hundreds of billions of points). Here, we present a high-throughput workflow called ‘Laserfarm’ enabling the efficient, scalable and distributed processing of multi-terabyte LiDAR point clouds from national and regional airborne laser scanning (ALS) surveys into geospatial data products of ecosystem structure. Laserfarm is a free and open-source, end-to-end workflow which contains modular pipelines for the re-tiling, normalization, feature extraction and rasterization of point cloud information from ALS and other LiDAR surveys. The workflow is designed with horizontal scalability and can be deployed with distributed computing on different infrastructures, e.g. a cluster of virtual machines. 
We demonstrate the Laserfarm workflow by processing a country-wide multi-terabyte ALS dataset of the Netherlands (covering ∼34,000 km² with ∼700 billion points and ∼16 TB uncompressed LiDAR point clouds) into 25 raster layers at 10 m resolution capturing ecosystem height, cover and structural complexity at a national extent. The Laserfarm workflow, implemented in Python and available as Jupyter Notebooks, is applicable to other LiDAR datasets and enables users to execute automated pipelines for generating consistent and reproducible geospatial data products of ecosystem structure from massive amounts of LiDAR point clouds on distributed computing infrastructures, including cloud computing environments. We provide information on workflow performance (including total CPU times, total wall-time estimates and average CPU times for single files and LiDAR metrics) and discuss how the Laserfarm workflow can be scaled to other LiDAR datasets and computing environments, including remote cloud infrastructures. The Laserfarm workflow allows a broad user community to process massive amounts of LiDAR point clouds for mapping vegetation structure, e.g. for applications in ecology, biodiversity monitoring and ecosystem restoration.
- Research Article
61
- 10.1007/s00138-017-0845-3
- May 29, 2017
- Machine Vision and Applications
3D urban maps with semantic labels and metric information are not only essential for next-generation robots such as autonomous vehicles and city drones, but also help to visualize and augment the local environment in mobile user applications. The machine vision challenge is to generate accurate urban maps from existing data with minimal manual annotation. In this work, we propose a novel methodology that takes GPS-registered LiDAR (Light Detection And Ranging) point clouds and street view images as inputs and creates semantic labels for the 3D point clouds using a hybrid of rule-based parsing and learning-based labelling that combines point cloud and photometric features. The rule-based parsing boosts segmentation of simple and large structures such as street surfaces and building facades that span almost 75% of the point cloud data. For more complex structures, such as cars, trees and pedestrians, we adopt boosted decision trees that exploit both structure (LiDAR) and photometric (street view) features. We provide qualitative examples of our methodology in 3D visualization, where we construct parametric graphical models from labelled data, and in 2D image segmentation, where 3D labels are back-projected to the street view images. In quantitative evaluation, we report classification accuracy and computing times and compare results to competing methods on three popular databases: NAVTEQ True, Paris-Rue-Madame and TLS (terrestrial laser scanned) Velodyne.
- Conference Article
- 10.1109/icpads.2012.143
- Dec 1, 2012
A typical LIDAR (Light Detection and Ranging) scan contains hundreds of millions of points. As such, the visualization of LIDAR point clouds poses a significant challenge in data analysis. One solution is to display LIDAR point clouds on a large display wall with an array of LCD monitors. This provides researchers with a high-resolution display environment for looking at and studying large datasets. In this paper, we present a case study that visualizes LIDAR point clouds on a tiled display wall termed HIPerDisplay (Highly Interactive Parallelized Display). It has twenty 24-inch LCDs with a total resolution of 46 megapixels. Interaction between the user and the display wall is achieved by using a video camera system that is able to track the position of a hand-held light ball device. A user holds it to manipulate point clouds on HIPerDisplay. Case studies are conducted to study the LIDAR scans of slopes in the Houshanyue mountain areas in Taiwan. Experiments were conducted to examine the advantages of using the HIPerDisplay for point clouds in data post-processing. The experiments assess two tasks for manipulating point cloud data designed to evaluate the efficiency of the interactive devices. To evaluate the efficiency of the system, a group of thirty graduate students participated in the experiment. User surveys were performed to evaluate the efficiency of the system and to discover the users' opinions about using the interactive device in a large display environment. The results showed that the participants preferred to perform LIDAR data operation tasks on a high-resolution large display environment rather than on a single monitor. The results also showed that HIPerDisplay offered superior performance for the processing of large LIDAR datasets.