Abstract

The 3D point cloud, a common format for 3D data, is widely used in fields such as remote sensing, surveying, and robotics. To address two weaknesses of the attention mechanism, namely insufficient weighted feature information and the offset between weighted results and task expectations, this paper proposes a Multi-scale Learnable Key-channel Attention Network (MLKNet). First, we introduce a feature feedback-repair module to mitigate the impact of information loss in the feature embedding process. This module aims to fully embed the original data into a high-dimensional feature space, ensuring a rich supply of feature information for the subsequent transformer modules. Second, an efficient hierarchical local feature encoder extracts and aggregates local features from point clouds at various scales, significantly enhancing the model's capability to represent geometric structures. Third, a novel learnable key-channel attention module allows the task to influence the feature selection and weighting process directly, bringing the highlighted features as close to task expectations as possible and enhancing the network's perception of global semantic information. Benchmarked on several tasks, our method achieves an overall accuracy (OA) of 92.3% on the ModelNet40 classification task and an instance mean intersection over union (ins. mIoU) of 87.6% on the ShapeNet-part segmentation task. The results indicate the superior performance of our method.
