Abstract
Correlation filter (CF) based tracking algorithms have shown excellent performance in comparison with most state-of-the-art algorithms on the object tracking benchmark (OTB). Nonetheless, most CF based trackers consider only a limited single-channel feature, and the tracking model is updated frame by frame. This introduces erroneous information when the target undergoes sophisticated scenario changes such as background clutter, occlusion, and out-of-view, and the long-term accumulation of erroneous model updates causes tracking drift. To address these problems, in this paper we propose a robust multi-scale correlation filter tracking algorithm via self-adaptive fusion of multiple features. First, we fuse powerful complementary features, including the histogram of oriented gradients (HOG), color names (CN), and the histogram of local intensities (HI), in the response layer. The weights are assigned according to the proportion of the response scores generated by each feature, achieving self-adaptive fusion of multiple features for preferable feature representation. Meanwhile, an efficient model update strategy is proposed, which exploits a pre-defined response threshold as the discriminative condition for updating the tracking model. In addition, we introduce an accurate multi-scale estimation method integrated with the model update strategy, which further improves adaptability to scale variation. Both qualitative and quantitative evaluations on challenging video sequences demonstrate that the proposed tracker performs favorably against state-of-the-art CF based methods.
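To make the self-adaptive fusion and the gated model update concrete, the following is a minimal sketch in Python/NumPy. The function names, the peak-proportion weighting rule, and the threshold value are our assumptions for illustration, not the authors' reference implementation.

```python
import numpy as np

def fuse_responses(responses):
    """Fuse per-feature response maps (e.g. HOG, CN, HI) in the response layer.

    Each feature's weight is its peak response score as a proportion of the
    sum of all peak scores, so more confident features dominate adaptively.
    """
    peaks = np.array([r.max() for r in responses])
    weights = peaks / peaks.sum()                 # self-adaptive weights
    fused = sum(w * r for w, r in zip(weights, responses))
    return fused, weights

def should_update(fused_response, threshold=0.25):
    """Update the model only when the fused peak exceeds a pre-defined
    threshold, suppressing erroneous updates under occlusion or clutter."""
    return fused_response.max() > threshold

# Example: three 50x50 maps standing in for the HOG, CN and HI responses.
rng = np.random.default_rng(0)
maps = [rng.random((50, 50)) * s for s in (0.9, 0.6, 0.4)]
fused, w = fuse_responses(maps)
update = should_update(fused)                     # gate the filter update
```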
Highlights
Visual tracking is one of the most important and active research topics in computer vision, and it is widely applied in video surveillance, autonomous driving, unmanned aerial vehicles, human-computer interaction, robotics, and so forth [1,2]. The goal of visual tracking is to estimate the target position in each frame, given the target in the initial frame, and to predict its translation accurately in the subsequent image sequence.
Correlation filters were first introduced into visual tracking by Bolme et al., who learned a minimum output sum of squared error (MOSSE) filter on single-channel grayscale image patches and achieved real-time tracking at 669 frames per second [14] (a minimal sketch of this filter follows the highlights).
We propose a robust multi-scale correlation filter tracking algorithm via self-adaptive fusion of multiple hand-crafted features.
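As referenced above, here is a minimal sketch of the MOSSE filter of [14]: the filter is learned in closed form in the frequency domain as H* = Σ G_i ⊙ F_i* / Σ F_i ⊙ F_i*, where F_i are FFTs of grayscale training patches and G_i are FFTs of the desired Gaussian-shaped response maps. The regularization term eps is our assumption for numerical stability.

```python
import numpy as np

def train_mosse(patches, targets, eps=1e-5):
    """Closed-form MOSSE filter: H* = sum(G_i . conj(F_i)) / sum(F_i . conj(F_i)),
    minimizing the output sum of squared error over grayscale patches F_i
    and desired Gaussian-shaped responses G_i (element-wise, in FFT domain)."""
    num = np.zeros(patches[0].shape, dtype=complex)
    den = np.zeros(patches[0].shape, dtype=complex)
    for p, g in zip(patches, targets):
        F, G = np.fft.fft2(p), np.fft.fft2(g)
        num += G * np.conj(F)
        den += F * np.conj(F)
    return num / (den + eps)                      # eps avoids division by zero

def correlate(H_conj, patch):
    """Tracking step: response map = IFFT(H* . F) for a new search patch."""
    return np.real(np.fft.ifft2(H_conj * np.fft.fft2(patch)))
```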
Summary
Many tracking approaches employ general discriminative models, such as multiple instance learning (MIL) [4], support vector tracking [5], P-N learning [6], compressive sensing [7], online boosting [8], and the correlation filter based algorithms [13,14,15,16,17,18,19,20,21]. To address the problems mentioned above and to establish a desirable strategy for updating the tracking model, we propose a robust multi-scale correlation filter tracking algorithm based on self-adaptive fusion of multiple features; its main contributions, including the multi-scale estimation sketched below, are summarized in the highlights above.
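The sketch below illustrates one plausible realization of the multi-scale estimation step: a DSST-style search over a small scale pyramid, keeping the scale with the highest filter response. The scale factors, the `respond` callable, and the resize-based normalization are assumptions, not the paper's exact procedure.

```python
import numpy as np
import cv2

def estimate_scale(frame, center, base_size, respond,
                   factors=(0.95, 1.0, 1.05)):
    """Sample patches at several scales around the current center, score each
    with the correlation filter (`respond` returns a response map), and keep
    the scale whose response peak is highest."""
    best_scale, best_peak = 1.0, -np.inf
    for s in factors:
        w, h = int(base_size[0] * s), int(base_size[1] * s)
        x0, y0 = int(center[0] - w / 2), int(center[1] - h / 2)
        patch = frame[max(y0, 0):y0 + h, max(x0, 0):x0 + w]
        patch = cv2.resize(patch, base_size)      # normalize to the filter size
        peak = respond(patch).max()
        if peak > best_peak:
            best_scale, best_peak = s, peak
    return best_scale, best_peak
```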