Abstract Point cloud data is widely used in fields such as autonomous driving and robotic navigation, where classification and segmentation are core tasks. The extraction of local and global features from such data has become a major research focus. This paper proposes two modules: Dual Pooling Attention (DP-Attention) and the Residual Attention Module (RA-MLP). DP-Attention uses max pooling and average pooling to compute attention, capturing relationships both between points and between features to enhance local feature extraction. RA-MLP combines self-attention with residual connections to improve global feature extraction. The two modules are combined to build the DPRA network, which is designed for point cloud classification and segmentation. DPRA follows the encoder-decoder structure of U-Net, using both DP-Attention and RA-MLP in the encoder and only RA-MLP in the decoder. In experiments on three datasets, DPRA reaches a mean class accuracy of 91.6% on the synthetic ModelNet40 classification task. It also achieves the highest mean class accuracy (85.3%) and overall accuracy (86.1%) on the real-world ScanObjectNN classification task, and the highest mean Intersection over Union (mIoU) of 85.2% on the synthetic ShapeNet segmentation task. These results suggest that DPRA applies to a range of tasks and generalizes well across synthetic and real-world data, showing robustness and multi-task learning capability.
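To make the dual-pooling idea concrete, the following is a minimal sketch of attention computed from max-pooled and average-pooled statistics, first across features and then across neighboring points. It is an illustration only, not the paper's implementation: the abstract does not say how the two pooled values are combined, so the summation, the sigmoid gate, the softmax weighting, and the function name `dual_pooling_attention` are all assumptions, and the learned layers a real network would use are omitted.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax(xs):
    # Subtract the maximum for numerical stability.
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    total = sum(es)
    return [e / total for e in es]


def dual_pooling_attention(neigh):
    """Aggregate K neighbor feature vectors (each of length C) into one vector.

    Hypothetical sketch: the pooled values are simply summed, with no learned
    weights. This combination rule is an assumption, not taken from the paper.
    """
    K, C = len(neigh), len(neigh[0])

    # Attention between features: pool each channel over the K neighbors.
    chan_w = []
    for c in range(C):
        col = [neigh[k][c] for k in range(K)]
        chan_w.append(sigmoid(max(col) + sum(col) / K))
    gated = [[neigh[k][c] * chan_w[c] for c in range(C)] for k in range(K)]

    # Attention between points: pool each neighbor over its C channels.
    scores = [max(row) + sum(row) / C for row in gated]
    point_w = softmax(scores)

    # Weighted sum over neighbors yields one local feature vector.
    return [sum(point_w[k] * gated[k][c] for k in range(K)) for c in range(C)]
```

Calling `dual_pooling_attention` on a list of K neighbor vectors returns a single vector of length C. In a trained network, the pooled statistics would typically pass through learned layers before the gating step.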