Tracking the pose of a specific rigid object from monocular sequences is a fundamental problem in computer vision. State-of-the-art methods assume motion continuity between consecutive frames. However, drastic relative motion causes large inter-frame pose shifts, especially in applications such as robotic grasping, maintenance of failed satellites, and space debris removal. Large pose shifts break inter-frame motion continuity, leading to tracking failure. In this paper, we propose a robust and accurate monocular pose tracking method for objects undergoing large pose shifts. Using an indexable sparse viewpoint model to represent the object's 3D geometry, we propose establishing a transitional view, found by an efficient variable-step search, to recover motion continuity. Then, a region-based optimization algorithm is adopted to optimize the pose starting from the transitional view. Finally, we use a single-rendering-based pose refinement process to achieve highly accurate pose results. Experiments on the region-based object tracking (RBOT) dataset, the modified RBOT dataset, synthetic large-pose-shift sequences, and real sequences demonstrate that the proposed method outperforms state-of-the-art methods in tracking objects with large pose shifts.