Abstract

In visual tracking, the tracking model must be updated online, which often leads to the undesired inclusion of corrupted training samples and hence induces tracking failure. We present a locality preserving correlation filter (LPCF) that integrates a novel and generic decontamination approach to mitigate the model drift problem. Our decontamination approach preserves the local neighborhood structure of the feature points around the bounding box center. This tracking-result validation approach models not only the spatial neighborhood relationships but also the topological structure of the bounding box center. Additionally, a closed-form solution to our approach is derived, which allows the tracking-result validation process to be accomplished in only milliseconds. Moreover, a dimensionality reduction strategy is introduced to improve the real-time performance of our translation estimation component. Comprehensive experiments are performed on OTB-2015, LaSOT, and TrackingNet. The experimental results show that our decontamination approach remarkably improves the overall performance by 6.2%, 12.6%, and 3%, while our complete algorithm improves on the baseline by 27.8%, 34.8%, and 15%. Finally, our tracker achieves the best performance among existing decontamination trackers under the real-time requirement.
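To give a rough sense of the neighborhood-preservation idea, the sketch below validates a new bounding box center by comparing the local structure of feature points around it with the structure observed in the previous frame. This is a minimal illustration only, not the paper's actual formulation; the choice of feature points, the distance measure, the neighborhood size `k`, and the threshold `tol` are all assumptions made for the example.

```python
import numpy as np

def neighborhood_structure(center, points, k=8):
    """Offsets of the k nearest feature points relative to the box center.

    `center` is (x, y); `points` is an (N, 2) array of feature-point
    coordinates. Returns a (k, 2) array of offsets sorted by distance.
    Illustrative stand-in for a locality-preserving structure descriptor.
    """
    offsets = points - np.asarray(center, dtype=float)
    order = np.argsort(np.linalg.norm(offsets, axis=1))[:k]
    return offsets[order]

def validate_result(prev_center, prev_points, new_center, new_points,
                    k=8, tol=5.0):
    """Accept the new tracking result only if the local neighborhood
    structure around the predicted center stays close to the previous
    frame's structure (`tol` is a hypothetical pixel threshold)."""
    s_prev = neighborhood_structure(prev_center, prev_points, k)
    s_new = neighborhood_structure(new_center, new_points, k)
    drift = np.mean(np.linalg.norm(s_prev - s_new, axis=1))
    # If the structure has drifted too far, the sample is treated as
    # corrupted and the model update for this frame would be skipped.
    return drift < tol
```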

Highlights

  • Visual tracking, in general, refers to the task of estimating the location and size of an arbitrary target in an image sequence given only its initial state

  • Results on TrackingNet: We evaluate our tracker on TrackingNet and report results in Table 4

  • The results show that our tracker achieves a 52.4% AUC score, which is comparable with some deep learning-based trackers (ECO 55.4% and CFNet 57.8%)

Summary

Introduction

Visual tracking, in general, refers to the task of estimating the location and size of an arbitrary target in an image sequence given only its initial state. Deep learning-based methods [7,8,9,10,11] have dominated this field and achieved very promising performance as well as very high speed (e.g., DaSiamRPN [12] runs at 160 FPS). However, most deep learning-based methods rely on training on expensive GPUs with enormous quantities of data, so it remains challenging and meaningful to explore efficient non-deep-learning methods. Among non-deep-learning methods, there are two main approaches to visual tracking, namely generative and discriminative methods. Discriminative approaches [2,3,16] treat tracking as differentiating the object from the background.
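As background for the discriminative correlation-filter paradigm that LPCF builds on, the sketch below shows a generic MOSSE-style closed-form filter learned in the Fourier domain and used to locate the target in a new search patch. This is a textbook formulation, not the paper's LPCF; the regularization value `lam` and the Gaussian-shaped desired response are assumptions for illustration.

```python
import numpy as np

def train_correlation_filter(patch, target_response, lam=1e-2):
    """Closed-form correlation filter in the Fourier domain (MOSSE-style).

    `patch` is a 2D grayscale template around the target and
    `target_response` is the desired output (e.g., a Gaussian peaked at
    the target center). `lam` is an assumed regularization constant.
    """
    F = np.fft.fft2(patch)
    G = np.fft.fft2(target_response)
    # H* = (G . conj(F)) / (F . conj(F) + lambda), computed element-wise
    return (G * np.conj(F)) / (F * np.conj(F) + lam)

def detect(H_conj, search_patch):
    """Correlate the learned filter with a new search patch and return
    the (row, col) location of the response peak, i.e., the estimated
    target translation within the patch."""
    response = np.real(np.fft.ifft2(H_conj * np.fft.fft2(search_patch)))
    return np.unravel_index(np.argmax(response), response.shape)
```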

