Abstract

Discriminative correlation filters (DCF) with powerful feature descriptors have proven very effective in advanced visual object tracking approaches. However, owing to their fixed capacity for discriminative learning, existing DCF trackers train the filter on a single template represented by convolutional neural network (CNN) features or hand-crafted descriptors. Such single-template learning cannot provide discriminative filters whose validity is guaranteed under appearance variation. To exploit the structural relevance of spatio-temporal appearance to the filtering system, we propose a new tracking algorithm that incorporates Grassmannian manifold learning into the DCF formulation. Our method constructs the appearance model within an online-updated affine subspace, enabling joint discriminative learning over the origin and basis of the subspace and thereby enhancing the discrimination and interpretability of the learned filters. In addition, to improve tracking efficiency, we adaptively integrate online incremental learning to update the obtained manifold. In this way, specific spatio-temporal appearance patterns are learned dynamically during tracking, highlighting relevant variations and alleviating the performance-degrading impact of less discriminative representations drawn from a single template. Experimental results on several well-known benchmarks, i.e., OTB2013, OTB2015, UAV123, and VOT2018, demonstrate the merits of the proposed method and its superiority over state-of-the-art trackers.
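To make the idea of an online-updated affine subspace concrete, the sketch below maintains an origin (mean template) and an orthonormal basis over vectorized appearance features and grows the basis incrementally as new templates arrive. It is a minimal illustration under simplifying assumptions, not the authors' implementation: the feature extraction, the DCF filter solver, and the exact Grassmannian update rule of the paper are omitted, and the class and parameter names are hypothetical.

```python
# Minimal sketch (assumed, simplified): an affine appearance subspace with an
# origin mu and an orthonormal basis U, updated online from incoming templates.
import numpy as np


class AffineSubspaceModel:
    def __init__(self, feat_dim, max_rank=16, tol=1e-3):
        self.mu = np.zeros(feat_dim)       # origin of the affine subspace
        self.U = np.zeros((feat_dim, 0))   # orthonormal basis (columns)
        self.n = 0                         # number of templates absorbed
        self.max_rank = max_rank           # cap on subspace dimension
        self.tol = tol                     # residual threshold for basis growth

    def update(self, x):
        """Absorb one vectorized appearance template x into the subspace."""
        x = np.asarray(x, dtype=float).ravel()
        # Update the origin as a running mean over the observed templates.
        self.n += 1
        self.mu += (x - self.mu) / self.n
        # Residual of the centered sample w.r.t. the current basis.
        c = x - self.mu
        if self.U.shape[1] > 0:
            c = c - self.U @ (self.U.T @ c)
        norm = np.linalg.norm(c)
        if norm > self.tol:
            # Grow the basis with the normalized residual direction.
            self.U = np.column_stack([self.U, c / norm])
            if self.U.shape[1] > self.max_rank:
                # Keep the basis compact: re-orthogonalize the retained columns.
                self.U, _ = np.linalg.qr(self.U[:, -self.max_rank:])

    def project(self, x):
        """Project a template onto the current affine subspace."""
        c = np.asarray(x, dtype=float).ravel() - self.mu
        return self.mu + self.U @ (self.U.T @ c)
```

In an actual tracker, each frame's target patch would be converted to a feature vector, passed to `update`, and the resulting origin and basis would feed the correlation-filter training in place of a single raw template.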
