Abstract
With the development of internet technology, the video data has been widely used in multimedia devices, such as video surveillance, webcast, and so on. Lots of visual processing algorithms are developed to handle the corresponding visual task, but the challenging problems still exist. In this paper, we propose a weighted multiple instances based deep correlation filter for visual tracking processing, which utilizes the importance of instances for training of deep learning model and correlation filter. First, the initial object appearance is modeled based on the confidence of the object and background at the first frame. During the tracking, the superpixel is used to capture the object appearance variations. Most importantly, our tracker can enhance the discriminative ability of the object using deep residual network and improve the tracking efficiency with correlation filter. Second, we introduce the sample importance into residual deep learning model to improve the training performance. We define the importance of each instance by computing the sore of all the pixels within the corresponding instance. Third, we update the parameters of deep learning network and correlation filter in a fixed interval frames to reduce the object drift. Extensive experiments on the OTB2015 benchmark and VOT2018 dataset demonstrate that the proposed object tracking algorithm outperforms the state-of-the-art tracking algorithms.
Highlights
For video security, the sensitive video content needs to be protected before transmission
The major contributions of this paper are of threefold: (1) We propose a weighted multiple instances based deep correlation filter for visual tracking
The results demonstrate the importance of weighted multiple instances learning in visual tracking, especially in background clutter (BC) and OCC
Summary
The sensitive video content needs to be protected before transmission. MDNet [18] shows the best tracking performance on the VOT2015 challenge Another approach consists of training a fully convolutional network and uses a feature map selection method to choose discriminative features between shallow and deep layers [19]. Deep learning based trackers are often offline trained in an image classification dataset, and online fine-tuning the parameters of deep learning network during the tracking. These trackers are less discriminative in various objects tracking domain. The experimental results demonstrate that the proposed visual object tracking algorithm performs favorably against other conventional deep learning trackers.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.