Abstract
3D semantic segmentation is a key component of many applications in robotics and autonomous vehicles. In these applications, 3D data are usually acquired by LiDAR sensors, yielding a point cloud: a set of points that is unstructured and inherently sparse. For 3D semantic segmentation, in which each point of the cloud must be assigned a semantic label, the current trend is to use deep neural network architectures for effective representation learning. Separately, a variety of 2D and 3D computer vision tasks have used attention mechanisms, which effectively re-weight features that have already been learned. In this work, we investigate the role of attention mechanisms in 3D semantic segmentation for autonomous driving by identifying the significance of different attention mechanisms when they are integrated into existing deep learning networks. Our study is supported by extensive experiments on two standard autonomous driving datasets, Street3D and SemanticKITTI, which allow us to draw conclusions at both the quantitative and qualitative levels. Our experimental findings show a clear performance advantage when attention mechanisms are adopted. In particular, we show that integrating a Point Transformer into an SPVCNN network yields an architecture that outperforms the state of the art on the Street3D dataset.