Abstract

In deep learning, point clouds are a primary input format for 3D data because they preserve detailed geometric information about objects in the original 3D space. PointNet++ is a deep network that takes point clouds directly as input, which avoids the information loss incurred by earlier approaches that first convert the point cloud into a voxel grid or a collection of 2D images. Although PointNet++ can process point cloud data directly, the unordered, irregular, and unevenly distributed nature of that data means its feature extraction leaves room for improvement. The large volume of point cloud data can also cause training to become trapped in a local optimum, which degrades the results. In recent years, effective methods and strategies have emerged to address these problems. This work proposes three modifications to the PointNet++ network to improve its performance: feature similarity-based attention pooling, small kernel convolution, and the diverse branch block. Experiments show that the proposed modifications improve feature extraction and, in turn, the classification accuracy of PointNet++ on the ModelNet40_Normal_Resampled dataset, with an overall improvement of 1% over the original PointNet++.