Abstract

Visual tracking using multiple features has been proved as a robust approach because features could complement each other. Since different types of variations such as illumination, occlusion, and pose may occur in a video sequence, especially long sequence videos, how to properly select and fuse appropriate features has become one of the key problems in this approach. To address this issue, this paper proposes a new joint sparse representation model for robust feature-level fusion. The proposed method dynamically removes unreliable features to be fused for tracking by using the advantages of sparse representation. In order to capture the non-linear similarity of features, we extend the proposed method into a general kernelized framework, which is able to perform feature fusion on various kernel spaces. As a result, robust tracking performance is obtained. Both the qualitative and quantitative experimental results on publicly available videos show that the proposed method outperforms both sparse representation-based and fusion based-trackers.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.