Abstract
In visual tracking, the tracking model must be updated online, which often leads to the undesired inclusion of corrupted training samples and hence induces tracking failure. We present a locality preserving correlation filter (LPCF) that integrates a novel and generic decontamination approach to mitigate the model drift problem. Our decontamination approach maintains the local neighborhood structure of the feature points around the bounding box center. This tracking-result validation approach models not only the spatial neighborhood relationship but also the topological structure of the bounding box center. Additionally, we derive a closed-form solution to our approach, so the tracking-result validation process can be completed in only milliseconds. Moreover, a dimensionality reduction strategy is introduced to improve the real-time performance of our translation estimation component. Comprehensive experiments are performed on OTB-2015, LaSOT, and TrackingNet. The experimental results show that our decontamination approach improves the overall performance by 6.2%, 12.6%, and 3%, respectively, while our complete algorithm improves the baseline by 27.8%, 34.8%, and 15%. Finally, our tracker achieves the best performance among most existing decontamination trackers under the real-time requirement.
Highlights
Visual tracking, in general, refers to the task of estimating locations and sizes of an arbitrary target in image sequences with only its initial states
Results on TrackingNet: We evaluate our tracker on TrackingNet and report the results in Table 4
The results show that our tracker achieves a 52.4% AUC score, which is comparable with some deep learning-based trackers (ECO 55.4% and CFNet 57.8%)
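The AUC score quoted above is, in tracking benchmarks, commonly the area under the success plot: the fraction of frames whose predicted box overlaps the ground truth by more than a threshold, averaged over a range of thresholds. The following is a minimal sketch of that metric, not the benchmark's official evaluation code. The function names `iou` and `success_auc`, the `(x, y, w, h)` box layout, and the 21 evenly spaced thresholds are assumptions made for illustration.

```python
import numpy as np


def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x, y, w, h).

    The (x, y, w, h) layout is an assumption for this sketch.
    """
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Width and height of the overlapping rectangle (zero if disjoint).
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def success_auc(pred_boxes, gt_boxes, num_thresholds=21):
    """Area under the success plot, approximated as the mean success rate.

    num_thresholds=21 (steps of 0.05 on [0, 1]) is an assumed setting.
    """
    overlaps = np.array([iou(p, g) for p, g in zip(pred_boxes, gt_boxes)])
    thresholds = np.linspace(0.0, 1.0, num_thresholds)
    # Success rate at each threshold: share of frames with overlap above it.
    success = np.array([(overlaps > t).mean() for t in thresholds])
    return float(success.mean())
```

Under this definition, a score of 52.4% would mean that the success rate, averaged over the overlap thresholds, is 0.524. Individual benchmarks may differ in threshold spacing and in how missing predictions are handled.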
Summary
Visual tracking, in general, refers to the task of estimating the locations and sizes of an arbitrary target in image sequences with only its initial states. Deep learning-based methods [7,8,9,10,11] have dominated this field and achieved very promising performance as well as very fast speed (e.g., DaSiamRPN [12] at 160 FPS). However, most deep learning-based methods rely on training on expensive GPUs with very large quantities of data, so it is still challenging and meaningful to explore efficient non-deep-learning methods. Among non-deep-learning methods, there are two main approaches to visual tracking: generative and discriminative methods. Discriminative approaches [2,3,16] treat tracking as differentiating the object from the
Sensors 2020, 20, 6853; doi:10.3390/s20236853 www.mdpi.com/journal/sensors