Abstract

Although numerous trackers have been designed to adapt to nonstationary image streams that change over time, enabling a tracker to accurately distinguish the target from the background in every frame remains challenging. This paper proposes a robust superpixel-based tracker with depth fusion, which exploits both the rich structural information and flexibility of the mid-level features captured by superpixels and the depth map's discriminative power for separating the target from the background. By introducing graph-regularized sparse coding into the appearance model, the local geometrical structure of the data is taken into account, giving the resulting appearance model stronger discriminative ability. In addition, the similarity between the neighborhoods of target superpixels in two adjacent frames is incorporated into the refinement of the target estimate, which yields more accurate localization. Most importantly, the depth cue is fused into the superpixel-based target estimation to handle cluttered backgrounds whose appearance is similar to the target's. To evaluate the effectiveness of the proposed tracker, the authors contribute four video sequences covering different challenging situations. Comparison results demonstrate that the proposed tracker is more robust and accurate than seven state-of-the-art trackers.
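The graph-regularized sparse coding mentioned above can be illustrated with a minimal sketch. It solves min_S ||X − DS||² + λ||S||₁ + α·Tr(S L Sᵀ), where L is the graph Laplacian of an affinity matrix over the data samples; the Laplacian term encourages samples that are close on the graph to receive similar codes. The solver below uses plain proximal gradient descent (ISTA); the function names, parameter values, and optimization scheme are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def graph_laplacian(W):
    # L = Deg - W for a symmetric, nonnegative affinity matrix W.
    return np.diag(W.sum(axis=1)) - W

def graph_regularized_sparse_coding(X, D, W, lam=0.1, alpha=0.1, n_iter=200):
    """Illustrative solver (not the paper's): minimize over S
        ||X - D S||_F^2 + lam * ||S||_1 + alpha * Tr(S L S^T)
    by proximal gradient (ISTA) with a fixed step size.

    X: (m, n) data columns; D: (m, k) dictionary; W: (n, n) sample affinities.
    Returns S: (k, n) sparse codes, one column per sample.
    """
    L = graph_laplacian(W)
    S = np.zeros((D.shape[1], X.shape[1]))
    # Step size = 1 / Lipschitz constant of the smooth part's gradient.
    step = 1.0 / (2 * np.linalg.norm(D.T @ D, 2)
                  + 2 * alpha * np.linalg.norm(L, 2) + 1e-12)
    for _ in range(n_iter):
        # Gradient of the smooth terms: reconstruction + graph regularizer.
        grad = 2 * D.T @ (D @ S - X) + 2 * alpha * (S @ L)
        S = S - step * grad
        # Soft-thresholding: proximal operator of the L1 penalty.
        S = np.sign(S) * np.maximum(np.abs(S) - step * lam, 0.0)
    return S
```

With step size set to the inverse Lipschitz constant, ISTA decreases the objective monotonically, so the codes it returns always achieve a lower objective value than the all-zero initialization.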
