NGD-Transformer: Navigation Geodesic Distance Positional Encoding with Self-Attention Pooling for Graph Transformer on 3D Triangle Mesh

Jiafu Zhuang, Wei Zhuang, Xiaofeng Liu

doi:10.3390/sym14102050

Open Access

Journal: Symmetry
Publication Date: Oct 1, 2022
Citations: 1
License type: CC BY 4.0

Affiliation: Quanzhou Normal University, University of Manchester

Abstract

Following the significant success of the transformer in NLP and computer vision, this paper attempts to extend it to 3D triangle meshes. The aim is to learn a global representation of the shape with the transformer while capturing the inherent manifold information. To this end, this paper proposes a novel learning framework for 3D meshes named Navigation Geodesic Distance Transformer (NGD-Transformer). Specifically, the approach combines farthest point sampling with the Voronoi segmentation algorithm to generate uniform, non-overlapping manifold patches. However, the number of vertices differs from patch to patch. Therefore, self-attention graph pooling is employed to rank the vertices on each patch and select the most representative nodes, which are then reordered according to their scores to generate tokens and their raw feature embeddings. To better exploit the manifold properties of the mesh, this paper further proposes a novel positional encoding called navigation geodesic distance positional encoding (NGD-PE), which encodes the geodesic distance between vertices in a relative and spatially symmetric manner. Subsequently, the raw feature embeddings and positional encodings are summed to form the input embeddings fed to the graph transformer encoder, which produces the global representation of the shape. Experiments were conducted on several datasets, and the results show the excellent performance of the proposed method.
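
The patch-generation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the mesh is given as a weighted vertex graph (a hypothetical `adj` dictionary mapping each vertex to its `(neighbor, edge_length)` pairs) and it approximates geodesic distance by shortest paths along mesh edges. The function names `dijkstra` and `fps_voronoi` are chosen here for illustration only.

```python
import heapq


def dijkstra(adj, source):
    """Shortest-path distance from `source` to every vertex of a weighted graph.

    Assumption: edge-path length stands in for the true geodesic distance.
    """
    dist = {v: float("inf") for v in adj}
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u]:
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def fps_voronoi(adj, num_patches, start=0):
    """Pick seeds by farthest point sampling, then label each vertex
    with its nearest seed (a Voronoi partition, so patches do not overlap)."""
    seeds = [start]
    dist_to_seed = {start: dijkstra(adj, start)}
    nearest = dict(dist_to_seed[start])  # distance to the closest seed so far
    while len(seeds) < num_patches:
        nxt = max(nearest, key=nearest.get)  # vertex farthest from all seeds
        seeds.append(nxt)
        d = dijkstra(adj, nxt)
        dist_to_seed[nxt] = d
        for v in nearest:
            nearest[v] = min(nearest[v], d[v])
    # Ties go to the seed that was sampled first.
    labels = {v: min(seeds, key=lambda s: dist_to_seed[s][v]) for v in adj}
    return seeds, labels


# Toy example: a chain of six vertices with unit-length edges.
adj = {i: [] for i in range(6)}
for i in range(5):
    adj[i].append((i + 1, 1.0))
    adj[i + 1].append((i, 1.0))

seeds, labels = fps_voronoi(adj, num_patches=2)
print(seeds)   # seed vertices
print(labels)  # vertex -> seed of the patch it belongs to
```

On the toy chain, the two seeds land at opposite ends and each vertex joins the patch of the closer seed. The resulting patches generally contain different numbers of vertices, which is the inconsistency the abstract addresses with self-attention graph pooling.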

Full Text