Multiview 3D sensing and analysis for high quality point cloud reconstruction
Multiview 3D reconstruction techniques enable digital reconstruction of 3D objects from the real world by fusing different viewpoints of the same object into a single 3D representation. This process is by no means trivial and the acquisition of high quality point cloud representations of dynamic 3D objects is still an open problem. In this paper, an approach for high fidelity 3D point cloud generation using low cost 3D sensing hardware is presented. The proposed approach runs in an efficient low-cost hardware setting based on several Kinect v2 scanners connected to a single PC. It performs autocalibration and runs in real-time exploiting an efficient composition of several filtering methods including Radius Outlier Removal (ROR), Weighted Median filter (WM) and Weighted Inter-Frame Average filtering (WIFA). The performance of the proposed method has been demonstrated through efficient acquisition of dense 3D point clouds of moving objects.
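Of the filters named above, Radius Outlier Removal is the simplest to illustrate. The following is a minimal sketch, not the paper's implementation: it assumes the common definition of ROR (drop every point with fewer than a given number of neighbours inside a fixed radius) and uses a brute-force search. The function name, the parameter names, and the sample cloud are hypothetical.

```python
import math

def radius_outlier_removal(points, radius, min_neighbors):
    """Keep points that have at least `min_neighbors` other points within `radius`."""
    kept = []
    for i, p in enumerate(points):
        # Count neighbours of p, excluding p itself (brute force, O(n^2)).
        count = sum(
            1 for j, q in enumerate(points)
            if i != j and math.dist(p, q) <= radius
        )
        if count >= min_neighbors:
            kept.append(p)
    return kept

# Hypothetical cloud: three clustered points and one isolated outlier.
cloud = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 0.1, 0.0), (5.0, 5.0, 5.0)]
filtered = radius_outlier_removal(cloud, radius=0.5, min_neighbors=2)
```

A real-time pipeline would replace the quadratic search with a spatial index such as a k-d tree or voxel grid.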
- Research Article
1
- 10.5194/isprs-archives-xliii-b2-2022-251-2022
- May 30, 2022
- The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences
Abstract. 3D point clouds from terrestrial laser scanners (TLS) are used in a variety of fields and applications. To acquire high-quality point clouds that have enough point density, small scanning errors, and no lack of points in important regions, appropriate scan planning, including determination of scanner positions and scan conditions, is required. Currently, planning is supported by knowledge and experience of skilled workers, and it is difficult to ensure the quality of acquired point clouds. In this study, we propose a system for visualization of point clouds to support the acquisition of high-quality point clouds using TLS. The system allows the user to see and check the quality of scanned TLS point clouds and unscanned regions intuitively by superimposing the point clouds onto the real world using a mixed reality (MR) device. In addition, the system supports finding the next best scanner position for additional laser scans based on predicted scan quality visualization to acquire higher-quality points or fill the unscanned regions.
- Research Article
10
- 10.1007/s11852-013-0282-z
- Sep 5, 2013
- Journal of Coastal Conservation
Terrestrial laser scanning (TLS), also called ground-based LiDAR (Light Detection And Ranging), is a relatively new method that has revolutionised geomorphological research in many domains. However, detailed studies of tidal flats by TLS have not yet been described in the literature. This study aims to fill this methodological gap by applying TLS at two different locations on the coast of Jiangsu Province, Eastern China, and assessing the usability of this method for geomorphological research in such environments. The acquired point clouds are first processed to remove erroneous and noisy points. Subsequently, the point clouds are converted into polygonal meshes and grid-based digital terrain models (DTMs), which are more commonly used by the scientific community. The accuracy of the measurements is assessed by an analysis of elevation deviations for flat and horizontal concrete blocks. High quality point clouds with point densities of up to 4,000 points/m² were acquired for a distance of up to 200 m. The data allowed for the detection of small landforms such as tidal channels, creeks and ripples at centimetre and decimetre scale. The point clouds had an average error of approximately 3 mm; however, for a few points errors of up to 1.8 cm were detected. Based on the results it can be concluded that TLS can be a useful additional method for geomorphological research on tidal flats due to its ability to describe the landforms from high density point clouds. Repeated scanning could therefore provide data to quantitatively and qualitatively describe geomorphological changes over wider areas and thereby improve the understanding of sedimentation and erosion on tidal flats.
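The accuracy check on flat, horizontal surfaces can be sketched as follows. This is an illustrative assumption about how such elevation deviations might be computed, not the authors' procedure: the reference elevation is taken as the mean height of the points on the block, and the sample heights are made up.

```python
def elevation_deviation(z_values):
    """Return (mean absolute deviation, maximum deviation) from the mean elevation."""
    reference = sum(z_values) / len(z_values)      # assumed reference: mean height
    deviations = [abs(z - reference) for z in z_values]
    return sum(deviations) / len(deviations), max(deviations)

# Hypothetical heights (in metres) of scan points on a flat, horizontal block.
heights = [1.000, 1.002, 0.998, 1.004, 0.996]
mean_dev, max_dev = elevation_deviation(heights)
```

For a tilted surface, the reference would have to be a fitted plane rather than a single mean height.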
- Research Article
2
- 10.5194/isprs-archives-xli-b3-163-2016
- Jun 9, 2016
- The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences
Abstract. Photogrammetric processing algorithms can suffer problems due to either the initial image quality (noise, low radiometric quality, shadows and so on) or to certain surface materials (shiny or textureless objects). This can result in noisy point clouds and/or difficulties in feature extraction. Specifically, dense point clouds generated with photogrammetric methods from lightweight thermal camera images are noisier and sparser than point clouds from high-resolution digital camera images. In this paper, a new method is considered that produces a more reliable and dense thermal point cloud using the sparse thermal point cloud and a high-resolution digital point cloud. Both thermal and digital images were obtained with UAS (Unmanned Aerial System) based lightweight Optris PI 450 and Canon EOS 605D cameras. Thermal and digital point clouds, and orthophotos, were produced using photogrammetric methods. The problematic thermal point cloud was transformed into a high-density thermal point cloud using image processing methods such as rasterizing, registering, interpolation and filling. The results showed that the obtained thermal point cloud was, depending on the chosen processing parameters, 87% denser than the original point cloud. The second improvement was in the height accuracy of the thermal point cloud. The new densified point cloud has a more consistent elevation model, while the original thermal point cloud shows serious deviations from the expected surface model.
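The rasterizing and filling steps mentioned above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the paper's pipeline: points carry a single value (for example a temperature or a height), each grid cell stores the mean of the points that fall in it, and an empty cell is filled with the mean of its non-empty 4-neighbours. All names and the sample data are hypothetical.

```python
def rasterize(points, cell_size, width, height):
    """Average the values of points falling in each grid cell; empty cells are None."""
    sums = [[0.0] * width for _ in range(height)]
    counts = [[0] * width for _ in range(height)]
    for x, y, value in points:
        col, row = int(x // cell_size), int(y // cell_size)
        if 0 <= row < height and 0 <= col < width:
            sums[row][col] += value
            counts[row][col] += 1
    return [[sums[r][c] / counts[r][c] if counts[r][c] else None
             for c in range(width)] for r in range(height)]

def fill_holes(grid):
    """Fill each empty cell with the mean of its non-empty 4-neighbours (one pass)."""
    height, width = len(grid), len(grid[0])
    filled = [row[:] for row in grid]
    for r in range(height):
        for c in range(width):
            if grid[r][c] is None:
                neighbours = [
                    grid[rr][cc]
                    for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                    if 0 <= rr < height and 0 <= cc < width
                    and grid[rr][cc] is not None
                ]
                if neighbours:
                    filled[r][c] = sum(neighbours) / len(neighbours)
    return filled

# Hypothetical sparse samples (x, y, value) on a 3 x 1 grid of 1 m cells.
sparse = [(0.2, 0.5, 20.0), (0.8, 0.5, 20.0), (2.5, 0.5, 24.0)]
raster = rasterize(sparse, cell_size=1.0, width=3, height=1)
dense = fill_holes(raster)
```

Larger gaps would need repeated passes or a proper interpolation scheme; a single neighbour-averaging pass only closes one-cell holes.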
- Research Article
- 10.3390/s26020476
- Jan 11, 2026
- Sensors (Basel, Switzerland)
This paper proposes an optical three-dimensional (3D) point cloud acquisition and sketching system, which is not limited by the measurement size, unlike traditional 3D object measurement techniques. The system employs an optical displacement sensor for surface displacement scanning and a six-axis inertial sensor (accelerometer and gyroscope) for spatial attitude perception. A microprocessor control unit (MCU) is responsible for acquiring, merging, and calculating data from the sensors, converting it into 3D point clouds. Butterworth filtering and Mahony complementary filtering are used for sensor signal preprocessing and calculation, respectively. Furthermore, a human–machine interface is designed to visualize the point cloud and display the scanning path and measurement trajectory in real time. Compared to existing works in the literature, this system has a simpler hardware architecture, more efficient algorithms, and better operation, inspection, and observation features. The experimental results show that the maximum measurement error on 2D planes is 4.7% with a root mean square (RMS) error of 2.1%, corresponding to the reference length of 10.3 cm. For 3D objects, the maximum measurement error is 5.3% with an RMS error of 2.4%, corresponding to the reference length of 9.3 cm. Finally, it was verified that this system can also be applied to large-sized 3D objects for outlines.
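The idea of fusing gyroscope and accelerometer data for attitude estimation can be illustrated with a basic first-order complementary filter on a single tilt axis. This is a deliberately simpler stand-in, not the full 3D attitude filter the paper refers to: the blend factor `alpha`, the sample rate, and the input data are assumptions chosen for the example.

```python
import math

def complementary_filter(gyro_rates, accel_samples, dt, alpha=0.98):
    """Fuse gyroscope rates (rad/s) and accelerometer readings into a tilt angle (rad)."""
    angle = 0.0
    history = []
    for rate, (a_y, a_z) in zip(gyro_rates, accel_samples):
        accel_angle = math.atan2(a_y, a_z)   # tilt implied by the gravity direction
        gyro_angle = angle + rate * dt       # tilt predicted by integrating the gyro
        # Trust the gyro in the short term and the accelerometer in the long term.
        angle = alpha * gyro_angle + (1.0 - alpha) * accel_angle
        history.append(angle)
    return history

# Hypothetical data: sensor held still at a 45 degree tilt, sampled at 100 Hz.
n = 500
angles = complementary_filter([0.0] * n, [(1.0, 1.0)] * n, dt=0.01)
```

With a stationary sensor the estimate converges from zero toward the accelerometer's 45 degree reading; a larger `alpha` slows that convergence but suppresses more accelerometer noise.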
- Research Article
4
- 10.3390/rs16234513
- Dec 1, 2024
- Remote Sensing
In recent years, due to the significant advancements in hardware sensors and software technologies, 3D environmental point cloud modeling has gradually been applied in the automation industry, autonomous vehicles, and construction engineering. With the high-precision measurements of 3D LiDAR, its point clouds can clearly reflect the geometric structure and features of the environment, thus enabling the creation of high-density 3D environmental point cloud models. However, due to the enormous quantity of high-density 3D point clouds, storing and processing these 3D data requires a considerable amount of memory and computing time. In light of this, this paper proposes a real-time 3D point cloud environmental contour modeling technique. The study uses the point cloud distribution from the 3D LiDAR body frame point cloud to establish structured edge features, thereby creating a 3D environmental contour point cloud map. Additionally, unstable objects such as vehicles appear during the mapping process; these objects are not regarded as part of the stable environmental model in this study. To address this issue, the study further removes these objects from the 3D point cloud through image recognition and LiDAR heterogeneous matching, resulting in a higher quality 3D environmental contour point cloud map. This 3D environmental contour point cloud not only retains the recognizability of the environmental structure but also solves the problems of massive data storage and processing. Moreover, the method proposed in this study can run in real time without requiring the 3D point cloud to be organized in a structured order, making it applicable to unorganized 3D point cloud LiDAR sensors. Finally, the feasibility of the proposed method in practical applications is also verified through actual experimental data.
- Research Article
6
- 10.5194/essd-16-5767-2024
- Dec 19, 2024
- Earth System Science Data
Abstract. Permafrost landscapes in the Arctic are highly vulnerable to warming, with rapid changes underway. High-resolution remote sensing, especially aerial datasets, offers valuable insights into current permafrost characteristics and thaw dynamics. Here, we present a new dataset of very high resolution orthomosaics, point clouds, and digital surface models that we acquired over permafrost landscapes in northwestern Canada and northern and northwestern Alaska for the purpose of better understanding the impacts of climate change on permafrost landscapes. The imagery was collected with the Modular Aerial Camera System (MACS) during aerial campaigns conducted by the Alfred Wegener Institute in the summers of 2018, 2019, and 2021. The MACS was specifically developed by the German Aerospace Center (DLR) for operation under challenging light conditions in polar environments. It features cameras in the optical and the near-infrared wavelengths with up to a 16 MP resolution. We processed the images to four-band (blue–green–red–near-infrared) orthomosaics and digital surface models with spatial resolutions of 7 to 20 cm as well as 3D point clouds with point densities of up to 41 points m⁻². The dataset collection features 102 subprojects from 35 target regions (1.4–161.1 km² in size). Project sizes range from 4.8 to 336 GB. In total, 3.17 TB were published. The horizontal precision of the datasets is in the range of 1–2 px and vertical precision is better than 0.10 m. The datasets are not radiometrically calibrated. Overall, these very high resolution images and point clouds provide significant opportunities for mapping permafrost landforms and generating detailed training datasets for machine learning, can serve as a baseline for change detection for thermokarst and thermo-erosion processes, and help with upscaling of field measurements to lower-resolution satellite observations.
The dataset is available on the PANGAEA repository at https://doi.org/10.1594/PANGAEA.961577 (Rettelbach et al., 2024).
- Conference Article
537
- 10.1109/cvpr.2019.00047
- Jun 1, 2019
3D point cloud generation is of great use for 3D scene modeling and understanding. Real-world 3D object point clouds can be properly described by a collection of low-level and high-level structures such as surfaces, geometric primitives, semantic parts, etc. In fact, there exist many different representations of a 3D object point cloud as a set of point groups. Existing frameworks for point cloud generation either do not consider structure in their proposed solutions, or assume and enforce a specific structure/topology, e.g. a collection of manifolds or surfaces, for the generated point cloud of a 3D object. In this work, we propose a novel decoder that generates a structured point cloud without assuming any specific structure or topology on the underlying point set. Our decoder is softly constrained to generate a point cloud following a hierarchical rooted tree structure. We show that given enough capacity and allowing for redundancies, the proposed decoder is very flexible and able to learn any arbitrary grouping of points including any topology on the point set. We evaluate our decoder on the task of point cloud generation for 3D point cloud shape completion. Combined with encoders from existing frameworks, we show that our proposed decoder significantly outperforms state-of-the-art 3D point cloud completion methods on the ShapeNet dataset.
- Book Chapter
9
- 10.1007/978-3-030-36711-4_20
- Jan 1, 2019
We propose a novel concept to directly match feature descriptors extracted from RGB images with feature descriptors extracted from 3D point clouds. We use this concept to localize the position and orientation (pose) of the camera of a query image in dense point clouds. We generate a dataset of matching 2D and 3D descriptors, and use it to train a proposed Descriptor-Matcher algorithm. To localize a query image in a point cloud, we extract 2D key-points and descriptors from the query image. Then the Descriptor-Matcher is used to find the corresponding pairs of 2D and 3D key-points by matching the 2D descriptors with the pre-extracted 3D descriptors of the point cloud. This information is used in a robust pose estimation algorithm to localize the query image in the 3D point cloud. Experiments demonstrate that directly matching 2D and 3D descriptors is not only a viable idea but can also be used for camera pose localization in dense 3D point clouds with high accuracy.
- Research Article
24
- 10.1016/j.cmpb.2021.106077
- Apr 3, 2021
- Computer Methods and Programs in Biomedicine
Recovering dense 3D point clouds from single endoscopic image
- Research Article
12
- 10.1049/cvi2.12136
- Aug 27, 2022
- IET Computer Vision
Deep learning‐based single view 3D reconstruction is a hot topic in computer vision. However, predicting a realistic 3D point cloud from a single image is an ill‐posed problem. In recent years, most single‐view 3D point cloud prediction methods have used a straight‐through structure, which loses part of the feature information and part of the detail of the resulting point clouds and leads to unsatisfactory visual quality in the reconstructed point clouds. In this paper, a Feature‐Enhanced 3D point clouds generation Network (3D‐FENet) from a single image is proposed. To enhance the feature information of the RGB image, an edge extraction module is adopted. In the process of point cloud generation, a point cloud pyramid is designed, which combines a low resolution point cloud with a high resolution point cloud to enhance the local details of the generated point clouds. In the fine‐tuning stage, the differential projection module is used to fine‐tune the whole network by 2D projection of reconstructed point clouds. Experimental results show that the performance of the authors’ proposed method is better than the state‐of‐the‐art studies.
- Research Article
2
- 10.25972/opus-14449
- Jan 1, 2017
- Online Publication Service of Würzburg University (Würzburg University)
3D point clouds are a de facto standard for 3D documentation and modelling. Advances in laser scanning technology broaden the usability of and access to 3D measurement systems. 3D point clouds are used in many disciplines such as robotics, 3D modelling, archeology and surveying. Scanners are able to acquire up to a million points per second to represent the environment with a dense point cloud. This represents the captured environment with a very high degree of detail. The combination of laser scanning technology with photography adds color information to the point clouds. Thus the environment is represented more realistically. Full 3D models of environments, without any occlusion, require multiple scans. Merging point clouds is a challenging process. This thesis presents methods for point cloud registration based on the panorama images generated from the scans. Image representation of point clouds introduces 2D image processing methods to 3D point clouds. Several projection methods for the generation of panorama maps of point clouds are presented in this thesis. Additionally, methods for point cloud reduction and compression based on the panorama maps are proposed. Due to the large amounts of data generated from the 3D measurement systems these methods are necessary to improve the point cloud processing, transmission and archiving. This thesis introduces point cloud processing methods as a novel framework for the digitisation of archeological excavations. The framework replaces the conventional documentation methods for excavation sites. It employs point clouds for the generation of the digital documentation of an excavation with the help of an archeologist on-site. The 3D point cloud is used not only for data representation but also for analysis and knowledge generation. Finally, this thesis presents an autonomous indoor mobile mapping system. The mapping system focuses on the sensor placement planning method.
Capturing a complete environment requires several scans. The sensor placement planning method solves for the minimum required scans to digitise large environments. Combining this method with a navigation system on a mobile robot platform enables it to acquire data fully autonomously. This thesis introduces a novel hole detection method for point clouds to detect obscured parts of a captured environment. The sensor placement planning method selects the next scan position with the most coverage of the obscured environment. This reduces the required number of scans. The navigation system on the robot platform consists of path planning, path following and obstacle avoidance. This guarantees the safe navigation of the mobile robot platform between the scan positions. The sensor placement planning method is designed as a standalone process that could be used with a mobile robot platform for autonomous mapping of an environment or as an assistant tool for the surveyor on scanning projects.
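The selection rule described above (pick the next scan position that covers the most still-obscured area) can be sketched as a greedy loop. This is an illustrative assumption, not the thesis's planner: candidate positions and the hole regions each one would cover are given as plain sets, visibility computation is left out, and a greedy choice does not in general guarantee the true minimum number of scans.

```python
def plan_scans(candidates, holes):
    """Greedily pick scan positions until no candidate covers a remaining hole.

    candidates: dict mapping a position name to the set of hole ids it can see.
    holes: iterable of hole ids that still need to be scanned.
    Returns (ordered list of chosen positions, set of holes left uncovered).
    """
    remaining = set(holes)
    plan = []
    while remaining:
        # Next best position: the one covering the most still-uncovered holes.
        best = max(candidates, key=lambda name: len(candidates[name] & remaining))
        gained = candidates[best] & remaining
        if not gained:
            break  # no candidate can see any of the remaining holes
        plan.append(best)
        remaining -= gained
    return plan, remaining

# Hypothetical candidate positions and the hole regions visible from each.
positions = {"A": {1, 2, 3}, "B": {3, 4}, "C": {4, 5, 6, 7}}
plan, uncovered = plan_scans(positions, holes=range(1, 8))
```

Here position "B" is never chosen because the holes it sees are already covered by "C" and "A".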
- Dissertation
2
- 10.32657/10356/172100
- Jan 1, 2023
3D point cloud semantic segmentation is a fundamental scene understanding task. Typical 3D point cloud semantic segmentation approaches analyze the 3D information of LiDAR point clouds and predict the classes of every point in the point cloud scenes. However, existing 3D-based approaches still cannot fulfil the requirements of real-world applications in terms of accuracy. Considering that environments in the wild are dynamic, temporal information is an important clue for identifying dynamic objects and can potentially enhance 3D segmentation models. Therefore, 4D point cloud semantic segmentation is proposed to fully use the temporal and spatial information from 4D point clouds to enhance the performance of existing 3D works. On the other hand, training an effective 4D or 3D segmentation model requires a huge amount of data while manual annotations of point clouds are expensive. Weakly supervised segmentation approaches on 4D point clouds are able to train the segmentation model with minimum annotation requirements. In this thesis, we study fully supervised 4D point cloud semantic segmentation and further weakly supervised segmentation methods on 4D point clouds. 4D point cloud segmentation recognizes the labels of every point in 3D point cloud sequences or 4D point clouds. The temporal information in 4D point clouds is crucial for robotic systems to recognize dynamic objects. However, the area of 4D point cloud segmentation is still under-investigated, and existing approaches suffer from low efficiency and performance. To address this problem, we propose a novel framework called SpSequenceNet that fuses information from previous frames to the target frame. This framework is presented in Chapter 3. The network is designed based on 3D sparse convolution and includes two novel modules: Cross-frame Global Attention (CGA) module and Cross-frame Local Interpolation module (CLI). 
These modules capture spatial and temporal information from previous frames to enhance the predictions of the current frame. CGA selects the important features in the target features with a global summary of the previous feature. CLI interpolates the features of local regions in the previous frame and enhances the features in the target frame. We observe that the overall improvement of SpSequenceNet is still not satisfactory. In Chapter 4, we extend SpSequenceNet and incorporate more information, i.e., the temporal variation information and the point-level detail information. Based on CLI, we design a temporal variation-aware interpolation to improve the performance of high-speed object segmentation. We also design a temporal voxel-point refinement module to refine the predictions with point-level information. Furthermore, in Chapter 5, we propose a novel module, FeatProp, to capture more temporal information. To this end, we design three novel approaches to enhance the features of target frames by extracting different temporal information in the local regions and global regions. Experimental results demonstrate that our frameworks achieve superior performance in 4D semantic segmentation. For weakly supervised segmentation on 4D point clouds, we first propose a new weakly supervised training task with 0.001% initial annotations. This task is introduced in Chapter 6. Specifically, we divide 4D point cloud datasets into a series of 100-frame sequences. Then, we sample around 0.1% of the points in the first frame of each sequence and annotate these points as initial annotations. In such a weak setting, making use of the huge number of unannotated frames is the core problem in obtaining effective models. Hence, we propose a novel temporal-spatial framework called W4DTS to utilise the annotated frames for generating high-quality pseudo-labels in the unannotated frames. We train our models with the generated pseudo-labels.
In W4DTS, we propose a temporal matching module to select the most confident points as the pseudo annotated points. We further use a spatial graph propagation module to propagate the label information of initial annotations and pseudo annotated points to the relevant point cloud frames and generate more pseudo labels. However, we observe that global label propagation tends to propagate noise and errors easily. What makes things worse is that those errors also generate more false pseudo labels in the next frame through the temporal matching module. In Chapter 7, we propose a novel approach, Progressive 4D Grouping (P4G), to improve the final model with higher pseudo label quality. P4G groups annotated and high-confidence unannotated points in each 3D point cloud sequence and generates high-quality pseudo labels with very sparse annotated points. To further improve our progressive 4D grouping approach, we design cross-frame contrastive learning and local consistency learning to enhance the quality of our 4D grouping. Our experimental results show that P4G achieves state-of-the-art performance.
- Research Article
32
- 10.1016/j.eswa.2023.120730
- Jun 10, 2023
- Expert Systems with Applications
Extracting cow point clouds from multi-view RGB images with an improved YOLACT++ instance segmentation
- Conference Article
7
- 10.1109/mmsp48831.2020.9287165
- Sep 21, 2020
With the rapid development of point cloud acquisition technologies, high-quality human-shape point clouds are increasingly used in VR/AR applications and in 3D graphics in general. To achieve near-realistic quality, such content usually contains an extremely high number of points (over 0.5 million points per 3D object per frame) and associated attributes (such as color). For this reason, efficient, dedicated 3D Point Cloud Compression (3DPCC) methods become essential. This requirement is even stronger in the case of dynamic content, where the coordinates and attributes of the 3D points evolve over time. In this paper, we propose a novel skeleton-based 3DPCC approach, dedicated to the specific case of dynamic point clouds representing humanoid avatars. The method relies on a multi-view 2D human pose estimation of 3D dynamic point clouds. By using the DensePose neural network, we first extract the body parts from projected 2D images. The obtained 2D segmentation information is back-projected and aggregated into the 3D space. This procedure makes it possible to partition the 3D point cloud into a set of 3D body parts. For each part, a 3D affine transform is estimated between every two consecutive frames and used for 3D motion compensation. The proposed approach has been integrated into the Video-based Point Cloud Compression (V-PCC) test model of MPEG. Experimental results show that the proposed method, in the particular case of body motion with small amplitudes, outperforms the V-PCC test model in the lossy inter-coding condition by up to 83% in terms of bitrate reduction in low bit rate conditions. Meanwhile, the proposed framework holds the potential of supporting various features such as regions of interest and levels of detail.
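The per-part affine estimation between consecutive frames can be sketched as an ordinary least-squares fit. This is a minimal illustration under strong assumptions, not the paper's estimator: point correspondences between the two frames are taken as already known, the fit solves the normal equations with plain Gauss-Jordan elimination, and all names and sample points are hypothetical.

```python
def fit_affine(src, dst):
    """Least-squares 3D affine transform mapping src points onto dst points.

    Returns a 3x4 matrix M (list of rows) such that dst ~ M * [x, y, z, 1].
    """
    rows = [[x, y, z, 1.0] for x, y, z in src]
    # Normal equations (X^T X) B = X^T Y, stored as a 4x7 augmented matrix.
    aug = [[sum(r[i] * r[j] for r in rows) for j in range(4)] +
           [sum(r[i] * d[k] for r, d in zip(rows, dst)) for k in range(3)]
           for i in range(4)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(4):
        pivot = max(range(col, 4), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("degenerate input: need at least 4 non-coplanar points")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(4):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    # Solution column k holds the coefficients for output axis k.
    return [[aug[i][4 + k] for i in range(4)] for k in range(3)]

def apply_affine(matrix, point):
    """Apply a 3x4 affine matrix to one 3D point."""
    x, y, z = point
    return tuple(m[0] * x + m[1] * y + m[2] * z + m[3] for m in matrix)

# Hypothetical body part: frame t and frame t+1 (x stretched by 2, then shifted).
frame_t = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 1)]
frame_t1 = [(2 * x + 0.5, y - 0.2, z + 0.1) for x, y, z in frame_t]
M = fit_affine(frame_t, frame_t1)
predicted = [apply_affine(M, p) for p in frame_t]  # motion-compensated prediction
```

In a motion-compensated coder, only the transform and the residual between `predicted` and the actual next frame would need to be encoded.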