Abstract

We propose a new visual tracking algorithm that leverages multi-level visual attention to make full use of the information available during tracking. Visual attention has been widely applied in many visual tasks, such as image captioning and question answering. However, most existing attention models focus on only one or two aspects and ignore other information that is useful for visual tracking. We identify four main attentional aspects in the tracking task and propose a unified network that leverages multi-level visual attention, comprising layer-wise attention, temporal attention, spatial attention, and channel-wise attention. Because deep features from different levels may suit different scenarios, we train an attention network in the off-line stage to facilitate feature selection during online tracking. To better exploit the temporal consistency assumption of visual tracking, we implement the attention network with long short-term memory (LSTM) units, which capture historical context information to perform more reliable inference at the current time step. Background clutter is more complicated in the tracking task than in image classification. We therefore purify the features with spatial attention and channel-wise attention to suppress background noise and highlight the target region. In addition, we enforce deep feature sharing across target candidates using Region of Interest pooling, so that the features of all candidates are extracted in a single forward pass of the DNN. To further improve tracking accuracy, we propose a strategy that augments trackers with the detection results of a generic object detector, reducing the risk of tracking drift. The proposed tracking algorithm compares favorably against state-of-the-art methods on three popular benchmark datasets. Extensive experimental evaluations demonstrate the effectiveness of the proposed techniques.
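
As a rough illustration of the feature purification step, the sketch below reweights a small feature map with channel-wise and spatial attention. It is not the network described in the abstract: the paper learns its attention weights, whereas this sketch assumes, purely for illustration, that attention scores come from average pooling followed by a softmax. All function names (`softmax`, `channel_attention`, `spatial_attention`, `purify`) are hypothetical.

```python
import math


def softmax(values):
    """Numerically stable softmax over a flat list of scores."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def channel_attention(features):
    """Return one weight per channel for a C x H x W nested list.

    Assumption: the score of a channel is its global average activation.
    A learned model would replace this scoring function.
    """
    scores = []
    for channel in features:
        cells = len(channel) * len(channel[0])
        scores.append(sum(sum(row) for row in channel) / cells)
    return softmax(scores)


def spatial_attention(features):
    """Return an H x W map of weights that sum to 1.

    Assumption: the score of a location is its mean activation across channels.
    """
    num_channels = len(features)
    height, width = len(features[0]), len(features[0][0])
    scores = [
        sum(features[c][i][j] for c in range(num_channels)) / num_channels
        for i in range(height)
        for j in range(width)
    ]
    weights = softmax(scores)
    return [weights[i * width:(i + 1) * width] for i in range(height)]


def purify(features):
    """Scale every activation by its channel weight and its spatial weight."""
    channel_w = channel_attention(features)
    spatial_w = spatial_attention(features)
    return [
        [
            [value * channel_w[c] * spatial_w[i][j] for j, value in enumerate(row)]
            for i, row in enumerate(channel)
        ]
        for c, channel in enumerate(features)
    ]


# Toy input: two channels of size 2 x 2, with the "target" at location (0, 0).
toy_features = [
    [[4.0, 0.0], [0.0, 0.0]],
    [[1.0, 0.0], [0.0, 0.0]],
]
purified = purify(toy_features)
```

In this toy input, the location with the strongest response receives the largest spatial weight and the more active channel receives the larger channel weight, which is the background suppression effect the abstract describes in qualitative terms.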
