Abstract
With the development of new methodologies for faster training on datasets, there is a need to provide an in-depth explanation of how such methods work. This paper aims to provide an understanding of one such correlation filter-based tracking technology, the Kernelized Correlation Filter (KCF), which exploits an implicit property of tracked images (their circulant structure) to train and track in real time. Unlike deep learning, it is not data intensive. KCF uses the implicit dynamic properties of the scene and the movements of image patches to form an efficient representation based on circulant matrices, which can be processed further using properties such as diagonalization in the Fourier domain. The computational efficiency of KCF, which makes it well suited to low-power heterogeneous computing platforms, lies in its ability to operate in a high-dimensional feature space without explicitly computing in that space. Despite its strong practical potential in visual tracking, the method and its performance still lack an in-depth critical treatment, which this paper aims to provide. We present a survey of KCF together with an experimental study that highlights its novel approach and, through observations on standard performance metrics, some of the future challenges associated with it, in an effort to make the algorithm easy to investigate. We further compare the method against the current state of the art (SOTA) on public benchmarks such as OTB-50, VOT-2015, and VOT-2019. We observe that KCF is a simple-to-understand tracking algorithm that does well on popular benchmarks and has potential for further improvement. The paper aims to give researchers a base for understanding KCF and comparing it with other tracking technologies, in order to explore the possibility of an improved KCF tracker.
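The role of the circulant structure can be illustrated with a small numerical sketch. This is not the authors' code: it assumes a 1-D signal in place of an image patch, plain linear ridge regression (no kernel yet), and illustrative names (`x`, `y`, `lam`) and sizes. It checks that solving the regression over all cyclic shifts of one sample gives the same weights as an element-wise division in the Fourier domain.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8
x = rng.standard_normal(n)   # base sample (stand-in for an image patch)
y = rng.standard_normal(n)   # one regression target per cyclic shift
lam = 0.1                    # ridge regularization (assumed value)

# Circulant data matrix: row i is x cyclically shifted by i positions.
X = np.stack([np.roll(x, i) for i in range(n)])

# Direct ridge regression: solve an n-by-n linear system.
w_direct = np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)

# Same solution in the Fourier domain: the DFT diagonalizes X, so the
# matrix inverse becomes an element-wise division. Where the complex
# conjugate appears depends on the shift and FFT conventions; the form
# below matches np.roll and np.fft.fft as used above.
x_hat = np.fft.fft(x)
y_hat = np.fft.fft(y)
w_hat = x_hat * y_hat / (np.abs(x_hat) ** 2 + lam)
w_fourier = np.fft.ifft(w_hat).real

print(np.allclose(w_direct, w_fourier))
```

The direct solve scales cubically with the number of samples, whereas the Fourier-domain version needs only FFTs and element-wise operations, which is the source of the speed-up described above.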
Highlights
Visual tracking can be considered as finding the minimum distance in feature space between the current position of the tracked object and the subspace represented by the previously stored data or previous tracking results.
We study some of the key components that contribute to the design of the Kernelized Correlation Filter (KCF), give a detailed overview of its working principles, and examine its performance under challenging scenarios such as variation in illumination and appearance, changes in viewpoint, occlusion, speed of the subject, type of crowd, and camera position.
Correlation filter (CF)-based visual trackers, in particular, have been popular because of their speed and accuracy. One such state-of-the-art tracker was presented by Henriques et al. [8]; it exploits the structure of the image samples, which makes it computationally very fast.
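As a rough illustration of how a kernelized correlation filter trains and detects without ever forming explicit high-dimensional features, the sketch below implements a 1-D toy version with a Gaussian kernel. It is a simplified sketch under stated assumptions, not the implementation of Henriques et al. [8]: the function names, the 1-D random signal, the division of the squared distance by the signal length, and the values of `sigma` and `lam` are all illustrative choices.

```python
import numpy as np

def gaussian_kernel_correlation(x, z, sigma):
    """Return k with k[i] = kappa(np.roll(x, i), z) for a Gaussian kappa."""
    n = x.size
    # Inner products of z with every cyclic shift of x, via the FFT.
    cross = np.fft.ifft(np.conj(np.fft.fft(x)) * np.fft.fft(z)).real
    # Squared distances, clipped at zero to absorb rounding error.
    sq_dist = np.maximum(x @ x + z @ z - 2.0 * cross, 0.0)
    # Normalizing by n is an assumption of this sketch.
    return np.exp(-sq_dist / (sigma ** 2 * n))

def train(x, y, sigma, lam):
    """Kernel ridge regression over all cyclic shifts of x (Fourier domain)."""
    k_xx = gaussian_kernel_correlation(x, x, sigma)
    return np.fft.fft(y) / (np.fft.fft(k_xx) + lam)

def detect(alpha_hat, x, z, sigma):
    """Response of the trained filter over all cyclic shifts of z."""
    k_xz = gaussian_kernel_correlation(x, z, sigma)
    # The labels used below are symmetric around shift 0, so alpha_hat is
    # real (up to rounding) and no complex conjugate is needed here.
    return np.fft.ifft(np.fft.fft(k_xz) * alpha_hat).real

rng = np.random.default_rng(0)
n = 64
x = rng.standard_normal(n)                      # template "patch"
offsets = np.minimum(np.arange(n), n - np.arange(n))
y = np.exp(-0.5 * (offsets / 2.0) ** 2)         # Gaussian labels, peak at shift 0

alpha_hat = train(x, y, sigma=0.5, lam=1e-4)

true_shift = 5
z = np.roll(x, true_shift)                      # next "frame": shifted template
response = detect(alpha_hat, x, z, sigma=0.5)
estimated_shift = int(np.argmax(response))
print(estimated_shift, true_shift)
```

Training and detection each cost a handful of FFTs; the kernel is evaluated for all cyclic shifts at once, so no kernel matrix is ever built.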
Summary
Visual tracking can be considered as finding the minimum distance in feature space between the current position of the tracked object and the subspace represented by the previously stored data or previous tracking results. Visual tracking has seen tremendous progress in recent years in robotics and monitoring applications. It aims to address the issues caused by noise, clutter, occlusion, illumination changes, and viewpoints (e.g., in mobile or aerial robotics).
Deformation: In the case of deformation, the location of the object as a whole, or of its parts, often varies between frames of a video.
Out-of-view: Out-of-view is the state in which the target, while being observed, moves out of the visible range of the sensor (for example, the camera) and disappears from view altogether. In such cases, the model loses information about the target, making it harder to re-detect.