Abstract

As a low-level 3D perception task, scene flow is a fundamental representation of dynamic scenes: it provides non-rigid motion descriptions for objects in the 3D environment and can strongly support many higher-level applications. Inspired by the revolutionary success of deep learning, many attention-based neural networks have recently been proposed to estimate scene flow from consecutive point clouds. However, extracting effective features and estimating accurate point motions for irregular and occluded point clouds remains challenging. In this paper, we propose PT-FlowNet, the first end-to-end scene flow estimation network that embeds the point transformer (PT) into all functional stages of the task. In particular, we design novel PT-based modules for the point feature extraction, iterative flow update, and flow refinement stages to encourage effective point-level feature aggregation. Experimental results on the FlyingThings3D and KITTI datasets show that PT-FlowNet achieves state-of-the-art performance. Trained on synthetic data only, PT-FlowNet generalizes to real-world scans and outperforms existing methods by at least 36.2% on the EPE3D metric on the KITTI dataset. Code and models can be accessed at https://github.com/FuJingyun/PT-FlowNet.

