Abstract

High-accuracy, high-density point cloud data is an important data source for depicting real ground objects, and using point cloud data directly for 3D object detection and recognition with deep learning methods has broad research prospects. However, many deep learning models in previous research ignore point cloud structure information and sampling randomness. To overcome this limitation, we propose a 3D point cloud deep learning model, the minimum bounding box over-segmentation–graph convolution 3D point cloud deep learning network model (MBBOS-GCN), which enhances the model's perception of structural information and reduces sampling randomness. In MBBOS-GCN, the number of sampled points is used as the scale, and a modified graph convolution model collects point cloud structure information at different scales. The point cloud is divided into several small regions by the minimum bounding box algorithm, and the farthest point sampling (FPS) algorithm is applied within each small region to reduce sampling randomness. Experiments on object classification and semantic scene segmentation show that: (1) the MBBOS-GCN model achieves high classification and segmentation accuracy, up to 91.87% on the ModelNet40 dataset and 89.5% on the ScanNet dataset, respectively; (2) the MBBOS-GCN model has good stability and robustness, with little change in accuracy as the density of the input point cloud data varies, and a small classification loss; (3) the MBBOS-GCN model can be adapted to real, complex scenes, where its classification accuracy reaches up to 97.53%. This performance can provide effective support for the construction of digital twin city background data and for the calibration and validation of multimode satellite feature inversion algorithms.
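The region-wise sampling idea described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify how the minimum bounding box is partitioned, so the sketch assumes an axis-aligned bounding box split into a uniform grid, and the names `farthest_point_sampling`, `regionwise_fps`, `cells_per_axis`, and `samples_per_region` are hypothetical.

```python
import numpy as np

def farthest_point_sampling(points, k, start=0):
    """Return indices of up to k points chosen by greedy farthest point sampling."""
    n = points.shape[0]
    k = min(k, n)
    chosen = np.empty(k, dtype=np.int64)
    chosen[0] = start
    # Distance from every point to its nearest already-chosen point.
    dist = np.linalg.norm(points - points[start], axis=1)
    for i in range(1, k):
        chosen[i] = int(np.argmax(dist))
        new_dist = np.linalg.norm(points - points[chosen[i]], axis=1)
        dist = np.minimum(dist, new_dist)
    return chosen

def regionwise_fps(points, cells_per_axis=2, samples_per_region=8):
    """Assumed partition: split the axis-aligned bounding box into a uniform
    grid of cells, then run FPS independently inside each non-empty cell."""
    lo, hi = points.min(axis=0), points.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid division by zero on flat axes
    cell = ((points - lo) / span * cells_per_axis).astype(np.int64)
    cell = np.minimum(cell, cells_per_axis - 1)  # points on the upper boundary
    region_id = (cell[:, 0] * cells_per_axis + cell[:, 1]) * cells_per_axis + cell[:, 2]
    picked = []
    for rid in np.unique(region_id):
        idx = np.flatnonzero(region_id == rid)
        local = farthest_point_sampling(points[idx], samples_per_region)
        picked.append(idx[local])  # map local indices back to the full cloud
    return np.concatenate(picked)

# Synthetic cloud used only for illustration.
rng = np.random.default_rng(0)
cloud = rng.random((2000, 3))
sample_idx = regionwise_fps(cloud, cells_per_axis=2, samples_per_region=8)
```

Because every region contributes its own samples, sparse or distant parts of the object cannot be skipped by an unlucky starting point, which is one plausible reading of how sampling within small regions reduces randomness.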

Highlights

  • For the single-scale graph convolutional deep learning network model, different numbers of sampling points yield different spatial resolutions of objects and different spatial structure information.

  • In the object classification experiments, 512 sampling points are selected as the input data for the graph convolution part of the single-scale graph convolution model; in the semantic segmentation experiments, 1024 sampling points are selected.
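As a rough illustration of why the number of sampling points acts as a scale, the sketch below builds a k-nearest-neighbor graph on 512 and on 1024 points drawn from the same synthetic cloud and compares the mean edge length. The k-NN construction, `k=16`, the random subsampling, and the name `knn_graph` are assumptions for illustration only; the source does not describe how the graph in the modified graph convolution is built.

```python
import numpy as np

def knn_graph(points, k):
    """Return (neighbor indices, neighbor distances), each of shape (n, k)."""
    diff = points[:, None, :] - points[None, :, :]
    d2 = np.einsum("ijk,ijk->ij", diff, diff)  # squared pairwise distances
    np.fill_diagonal(d2, np.inf)               # exclude self-loops
    nbr = np.argsort(d2, axis=1)[:, :k]
    nbr_dist = np.sqrt(np.take_along_axis(d2, nbr, axis=1))
    return nbr, nbr_dist

# Synthetic cloud used only for illustration.
rng = np.random.default_rng(0)
cloud = rng.random((2048, 3))

mean_edge = {}
for n in (512, 1024):
    # Random subsampling stands in for the sampling step here (an assumption).
    subset = cloud[rng.choice(len(cloud), size=n, replace=False)]
    nbr, nbr_dist = knn_graph(subset, k=16)
    mean_edge[n] = float(nbr_dist.mean())
```

With more sampled points, each neighborhood covers a smaller patch of the object, so the graph encodes finer local structure; with fewer points, the same `k` neighbors span a larger patch and capture coarser structure.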


Summary

Introduction

Computer vision perceives and recognizes the world by acquiring information from sensors rather than relying on human beings, and it has long been a research hotspot.[1,2,3,4] Target recognition is a fundamental and important research topic in computer vision that is widely used in reverse engineering, intelligent surveillance, and remote sensing.[5,6,7,8] Compared with two-dimensional (2D) target recognition, recognizing the position, shape, and pose of three-dimensional (3D) objects in space is more meaningful for practical application scenarios in areas such as unmanned systems and augmented reality technology.[9,10] 3D target recognition can be divided into three types according to the data source: the first is based [...] The third kind of target recognition methodology, which uses 3D data, has become a research hotspot in computer vision.[19]

