Towards Generic 3D Tracking in RGBD Videos: Benchmark and Baseline

Jinyu Yang,Aleš Leonardis,Zhongqun Zhang,Zhe Li,Feng Zheng,Hyung Jin Chang

doi:10.1007/978-3-031-20047-2_7

Abstract

AbstractTracking in 3D scenes is gaining momentum because of its numerous applications in robotics, autonomous driving, and scene understanding. Currently, 3D tracking is limited to specific model-based approaches involving point clouds, which impedes 3D trackers from applying in natural 3D scenes. RGBD sensors provide a more reasonable and acceptable solution for 3D object tracking due to their readily available synchronised color and depth information. Thus, in this paper, we investigate a novel problem: is it possible to track a generic (class-agnostic) 3D object in RGBD videos and predict 3D bounding boxes of the object of interest? To inspire research on this topic, we newly construct a standard benchmark for generic 3D object tracking, ‘Track-it-in-3D’, which contains 300 RGBD video sequences with dense 3D annotations and corresponding evaluation protocols. Furthermore, we propose an effective tracking baseline to estimate 3D bounding boxes for arbitrary objects in RGBD videos, by fusing appearance and spatial information effectively. Resources are available on https://github.com/yjybuaa/Track-it-in-3D.KeywordsObject tracking3D object trackingRGBD data

Full Text