Abstract

The tracking-by-detection framework is commonly adopted in visual tracking methods. It aims to localize the visual target object with a bounding box. However, a bounding box usually cannot describe the target object accurately and thus easily introduces noisy background information, which degrades the final tracking results. Recently, weighted patch representations of the object have been shown to be very effective at suppressing undesirable background information and can thus noticeably improve tracking results. In this paper, we propose a novel Spatial-Temporal Graph representation and Learning (STGL) model to generate a robust target representation for the visual tracking problem. The main aspect of STGL is that it exploits both the spatial (within each frame) and temporal (between consecutive frames) structure of patches simultaneously in a unified graph representation and semi-supervised learning model. Compared with existing works, STGL naturally exploits the learned object representation in the previous frame and can thus obtain the object representation in the current frame more accurately and robustly. A new ADMM algorithm is derived to solve the proposed STGL model. Based on the proposed object representation, we then adapt the structured SVM by introducing scale estimation to achieve object tracking. Extensive experiments show that our method outperforms state-of-the-art patch-based tracking methods on two standard benchmark datasets.
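
For intuition only, the minimal Python sketch below illustrates the general idea described in the abstract: patch weights are propagated on a graph that links patches within a frame (spatial edges) and across two consecutive frames (temporal edges), with the previous frame's learned weights acting as soft supervision. This is not the paper's STGL formulation or its ADMM solver; a plain graph-Laplacian-regularized least-squares stands in for the actual model, and the function names (`build_graph`, `propagate_weights`) and parameters (`sigma`, `lam`) are hypothetical choices made for illustration.

```python
# Illustrative sketch (not the paper's STGL model or ADMM solver):
# semi-supervised propagation of patch weights on a spatial-temporal graph.
import numpy as np

def build_graph(feats_prev, feats_cur, sigma=1.0):
    """Gaussian affinity matrix over patches of two consecutive frames.

    Spatial edges connect patches within each frame and temporal edges connect
    patches across frames; for simplicity both use the same feature kernel.
    """
    feats = np.vstack([feats_prev, feats_cur])            # (n_prev + n_cur, d)
    d2 = np.sum((feats[:, None, :] - feats[None, :, :]) ** 2, axis=-1)
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    return W

def propagate_weights(W, w_prev, lam=1.0):
    """Solve min_w (w - y)^T M (w - y) + lam * w^T L w.

    M masks the previous-frame (supervised) patches, y carries their learned
    weights, and L is the unnormalized graph Laplacian. The closed-form
    solution is w = (M + lam * L)^{-1} y.
    """
    n = W.shape[0]
    n_prev = w_prev.shape[0]
    L = np.diag(W.sum(axis=1)) - W                         # graph Laplacian
    y = np.zeros(n)
    y[:n_prev] = w_prev                                    # seed with previous frame
    M = np.zeros((n, n))
    M[:n_prev, :n_prev] = np.eye(n_prev)                   # fit only supervised patches
    w = np.linalg.solve(M + lam * L, y)
    return w[n_prev:]                                      # weights for current frame

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats_prev = rng.normal(size=(16, 32))                 # 16 patches, 32-d features
    feats_cur = feats_prev + 0.1 * rng.normal(size=(16, 32))
    w_prev = rng.uniform(size=16)                          # previous-frame patch weights
    W = build_graph(feats_prev, feats_cur)
    w_cur = propagate_weights(W, w_prev, lam=0.5)
    print(w_cur.round(3))
```

In this toy setup, current-frame patches whose features stay close to highly weighted previous-frame patches inherit large weights, which is the intuition behind exploiting the previous frame's representation to suppress background patches in the current frame.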
