Abstract
The success of deep learning methods has led to significant breakthroughs in 3-D point cloud processing tasks with applications in remote sensing. Existing methods rely on convolutions, which have notable limitations: they assume a uniform input distribution and cannot learn long-range dependencies. Recent works have shown that adding attention in conjunction with these methods improves performance. This raises a question: can attention layers completely replace convolutions? This letter proposes a fully attentional model—Point Transformer (PT)—for deriving a rich point cloud representation. The model’s shape classification and retrieval performance is evaluated on a large-scale urban data set, RoofN3D, and a standard benchmark data set, ModelNet40. Extensive experiments are conducted to test the model’s robustness to unseen point corruptions, analyzing its effectiveness on real data sets. The proposed method outperforms other state-of-the-art models on the RoofN3D data set, gives competitive results on the ModelNet40 benchmark, and shows high robustness to various unseen point corruptions. Furthermore, the model is highly memory- and space-efficient compared to other methods.
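The core idea of replacing convolutions with attention can be illustrated with a minimal sketch of scaled dot-product self-attention applied directly to a set of 3-D points. This is an illustrative assumption, not the paper's exact Point Transformer layer; the weight matrices `W_q`, `W_k`, and `W_v` and the feature dimension are hypothetical placeholders. Because every point attends to every other point, the layer captures long-range dependencies and makes no uniform-grid assumption about the input.

```python
import numpy as np

def self_attention(points, W_q, W_k, W_v):
    """Scaled dot-product self-attention over an (N, d_in) point set.

    Each point attends to all other points, so the output feature of a
    point can depend on geometry arbitrarily far away -- unlike a
    convolution with a fixed local receptive field.
    """
    q = points @ W_q                       # queries, shape (N, d)
    k = points @ W_k                       # keys,    shape (N, d)
    v = points @ W_v                       # values,  shape (N, d)
    scores = (q @ k.T) / np.sqrt(k.shape[1])
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)  # row-wise softmax
    return weights @ v                     # attention-weighted features

# Toy usage: 8 random 3-D points projected to a 16-dim feature space.
rng = np.random.default_rng(0)
pts = rng.normal(size=(8, 3))
W_q, W_k, W_v = (rng.normal(size=(3, 16)) for _ in range(3))
features = self_attention(pts, W_q, W_k, W_v)  # shape (8, 16)
```

Stacking such layers (with per-point feedforward blocks and normalization) yields a fully attentional encoder; a permutation of the input points permutes the output rows identically, which is the set-invariance property point cloud models need.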