Abstract

Efficient and accurate estimation of object pose is essential in numerous practical applications. Because depth data contains abundant geometric information, some existing methods are devoted to extracting features from 3D point clouds. However, these depth-based methods focus on local point cloud features and pay less attention to global information. Extracting and exploiting both the local and global geometric features in depth information is crucial for accurate predictions. To this end, we propose TransPose, a novel 6D pose estimation framework that combines a Transformer Encoder with a geometry-aware module to learn better point cloud feature representations. To better extract local geometric features, we design a graph convolution network-based feature extractor that first uniformly samples the point cloud and then extracts its point pair features. To further improve robustness to occlusion, we adopt a Transformer to propagate global information, so that each local feature obtains global information. Moreover, we introduce a geometry-aware module into the Transformer Encoder, which forms an effective constraint for point cloud feature learning and couples the global information exchange more tightly with point cloud tasks. Extensive experiments demonstrate the effectiveness of TransPose: our pose estimation pipeline achieves competitive results on three benchmark datasets.
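
The abstract mentions point pair features without giving their form. As a rough illustration only, the sketch below computes the classical four-dimensional point pair feature for two oriented points: the distance between them and three angles involving their normals. This classical definition is an assumption here, since the abstract does not state which variant TransPose uses. The function names `angle_between` and `point_pair_feature` are hypothetical, and the two points are assumed to be distinct with non-zero normals.

```python
import numpy as np


def angle_between(u, v):
    """Angle in radians between vectors u and v, in [0, pi]."""
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    # Clip to guard against rounding slightly outside [-1, 1].
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))


def point_pair_feature(p1, n1, p2, n2):
    """Classical point pair feature for oriented points (p1, n1) and (p2, n2).

    Returns (||d||, angle(n1, d), angle(n2, d), angle(n1, n2)) with d = p2 - p1.
    Assumes p1 != p2 and non-zero normals; this is an illustrative
    definition, not necessarily the one used in the paper.
    """
    d = p2 - p1
    return np.array([
        np.linalg.norm(d),
        angle_between(n1, d),
        angle_between(n2, d),
        angle_between(n1, n2),
    ])


if __name__ == "__main__":
    # Two points on a flat surface, one unit apart, both normals pointing up.
    p1, n1 = np.array([0.0, 0.0, 0.0]), np.array([0.0, 0.0, 1.0])
    p2, n2 = np.array([1.0, 0.0, 0.0]), np.array([0.0, 0.0, 1.0])
    print(point_pair_feature(p1, n1, p2, n2))
```

Because the feature depends only on relative distances and angles, it is unchanged by rigid motions of the point cloud, which is one reason such features are a common choice for describing local geometry.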
