Abstract

Learning-based 6-DOF (6D) pose tracking, which underpins many real-time applications such as augmented reality and robotic manipulation, has attracted growing attention as the field shifts from 2D to 3D vision with the increasing availability of depth sensors. However, the irregular nature of 3D point clouds makes this task challenging, particularly because the lack of explicit alignments hinders interaction and fusion between the observed point clouds. This paper therefore proposes a novel approach, named PA-Pose, for 6D pose tracking in point clouds. It treats forward-predicted dense correspondences within the overlap region as reliable alignments to guide feature fusion of the partial-to-partial point cloud pair. The relative transformation between adjacent observations is then continuously regressed from the point-wise fused features via confidence scoring, avoiding non-differentiable pose fitting. In addition, a shifted point convolution (SPConv) operation is introduced into the fusion process to further promote local context interaction of the observed point cloud pair within an expanded alignment field. Extensive experiments on two benchmark datasets (YCB-Video and YCBInEOAT) demonstrate that our method achieves state-of-the-art performance. Even though it takes only 3D point clouds as input, PA-Pose remains competitive with methods that fully exploit single-view RGB-D information. Finally, real-scene experiments on tracking industrial objects further validate the effectiveness of the proposed method.
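
To make the confidence-scored regression step concrete: the sketch below is a minimal, hypothetical PyTorch rendering of the idea, not the authors' implementation. It assumes a DenseFusion-style head in which every point of the fused feature map predicts a candidate pose (quaternion plus translation) and a confidence weight, and the final pose is the confidence-weighted combination, which keeps the whole pipeline differentiable and avoids a separate SVD/RANSAC fitting step. The class name ConfidencePoseHead, the feature dimension, and the weighted-quaternion aggregation are all illustrative assumptions.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class ConfidencePoseHead(nn.Module):
        """Hypothetical head: per-point pose candidates + confidence scores."""

        def __init__(self, feat_dim: int = 256):
            super().__init__()
            self.rot_head = nn.Conv1d(feat_dim, 4, 1)    # per-point quaternion
            self.trans_head = nn.Conv1d(feat_dim, 3, 1)  # per-point translation
            self.conf_head = nn.Conv1d(feat_dim, 1, 1)   # per-point confidence logit

        def forward(self, fused_feats: torch.Tensor):
            # fused_feats: (B, C, N) point-wise fused features of the pair
            quat = F.normalize(self.rot_head(fused_feats), dim=1)      # (B, 4, N)
            trans = self.trans_head(fused_feats)                       # (B, 3, N)
            conf = torch.softmax(self.conf_head(fused_feats), dim=2)   # (B, 1, N)

            # Confidence-weighted aggregation: a soft, fully differentiable
            # alternative to hard hypothesis selection or closed-form fitting.
            # Weighted quaternion averaging followed by renormalization is an
            # approximation, valid when candidates lie in the same hemisphere.
            quat_agg = F.normalize((quat * conf).sum(dim=2), dim=1)    # (B, 4)
            trans_agg = (trans * conf).sum(dim=2)                      # (B, 3)
            return quat_agg, trans_agg, conf

Because the confidence weights are produced by a softmax over points, gradients flow back through both the pose candidates and the scoring, so the tracker can learn which regions of the overlap are reliable without any non-differentiable fitting stage.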
