Siamese Visual Tracking With Deep Features and Robust Feature Fusion

Daqun Li, Yi Yu, Xize Wang

doi:10.1109/access.2019.2962388

Abstract

Trackers based on fully-convolutional Siamese networks treat tracking as the problem of learning a similarity function. By using shallow networks and off-line training, Siamese trackers achieve high tracking speed and perform well in simple scenes. However, because their features carry limited semantic information and their template is never updated, Siamese trackers still lag behind state-of-the-art methods in complex scenes and under challenges such as occlusion and deformation. In this paper, we propose a Siamese tracking algorithm with deep features and robust feature fusion (SiamDF). An improved ResNet-18 network replaces the traditional shallow network and extracts deep features with more semantic information. To eliminate the negative effect of padding and make better use of the deep network, the proposed algorithm adopts a spatial-aware sampling strategy to overcome the strict translation invariance. Meanwhile, multi-layer feature fusion produces a high-quality final response map, so the tracker can significantly reduce the impact of distractors in complex scenes. In addition, an adaptive feature information fusion is adopted to update the template, so that the algorithm can adapt to changes in the target appearance. Objective evaluation on the OTB100 dataset shows that the precision and the overlap success reach 0.852 and 0.658, respectively. Moreover, the EAO value on the VOT2016 database reaches 0.336. These results demonstrate that our algorithm effectively improves tracking performance and performs favorably in both robustness and accuracy.
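The abstract frames tracking as learning a similarity function and builds the final response map from several layers. The following is a minimal NumPy sketch of both ideas, not the paper's implementation: the function names, the toy feature tensors, and the weighted-sum fusion rule are all illustrative assumptions, since the excerpt does not state how SiamDF combines its layers.

```python
import numpy as np

def cross_correlate(search, template):
    """Slide `template` over `search` (both shaped C x H x W) and sum the
    element-wise products at each valid offset, giving a 2-D response map."""
    _, th, tw = template.shape
    _, sh, sw = search.shape
    response = np.zeros((sh - th + 1, sw - tw + 1))
    for i in range(response.shape[0]):
        for j in range(response.shape[1]):
            response[i, j] = np.sum(search[:, i:i + th, j:j + tw] * template)
    return response

def fuse_responses(responses, weights):
    """Weighted sum of same-sized response maps.
    Assumption: a plain normalized weighted sum stands in for the fusion rule."""
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    return sum(w * r for w, r in zip(weights, responses))

def peak_position(response):
    """Row and column of the highest score in a response map."""
    idx = np.unravel_index(np.argmax(response), response.shape)
    return tuple(int(i) for i in idx)

# Toy data: hide the template inside an otherwise empty search region.
rng = np.random.default_rng(0)
template = rng.standard_normal((2, 3, 3))       # hypothetical shallow features
search = np.zeros((2, 8, 8))
search[:, 2:5, 4:7] = template
response = cross_correlate(search, template)
peak = peak_position(response)                  # location of the best match

# A second, "deeper" layer with more channels but the same spatial size.
deep_template = rng.standard_normal((4, 3, 3))
deep_search = np.zeros((4, 8, 8))
deep_search[:, 2:5, 4:7] = deep_template
deep_response = cross_correlate(deep_search, deep_template)

fused = fuse_responses([response, deep_response], weights=[0.4, 0.6])
fused_peak = peak_position(fused)
```

In this toy setup the target sits at offset (2, 4) in both layers, so each response map and the fused map peak there. In a real tracker the per-layer maps can disagree, which is where the choice of fusion weights matters.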

Highlights

Visual tracking is an important direction in computer vision and has long attracted the attention of researchers
To eliminate the negative effect of padding and make better use of the deep network, the proposed algorithm adopts a spatial-aware sampling strategy to overcome the strict translation invariance, and multi-layer feature fusion produces a high-quality final response map
The tracker can significantly reduce the impact of the distractors in complex scenes

Summary

INTRODUCTION

Visual tracking is an important direction in computer vision and has long attracted the attention of researchers. Although the SiamRPN++ tracker [19] successfully introduces a deep architecture into Siamese tracking, it does not fuse the feature information of the multi-layer network in a principled way. To eliminate the negative effect of padding and make better use of the deep network, the proposed algorithm adopts a spatial-aware sampling strategy to overcome the strict translation invariance, and multi-layer feature fusion produces a high-quality final response map. We adopt an adaptive feature information fusion to update the otherwise constant template in the Siamese network, which makes the tracker more adaptive to changes in the target appearance. Inspired by the above studies, we propose a novel method to achieve multi-layer feature fusion, which can obtain a response map with high quality and significantly reduce the impact of the distractors in complex scenes.

The loss is computed over the response map:

$$L(h, r) = \frac{1}{|M|} \sum_{m \in M} l(h[m], r[m])$$

where M denotes the response map after cross-correlation, h[m] ∈ {+1, −1} and r[m] indicate the label and the score, and m ∈ M is the position in response map M. l is the logistic loss:

$$l(h[m], r[m]) = \log\left(1 + \exp(-h[m]\, r[m])\right)$$
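The loss described above applies a logistic loss at every position of the response map. The sketch below is a hedged illustration, not the authors' code: the excerpt is cut off before stating the formula, so it assumes the standard logistic loss log(1 + exp(-h·r)) averaged over all positions, and the function name `logistic_loss_map` is hypothetical.

```python
import numpy as np

def logistic_loss_map(labels, scores):
    """Mean logistic loss over a response map.

    labels: array of +1 / -1 values, one per position (h[m]).
    scores: array of real-valued scores, one per position (r[m]).
    Assumption: the per-position loss is log(1 + exp(-h * r)).
    """
    labels = np.asarray(labels, dtype=float)
    scores = np.asarray(scores, dtype=float)
    # logaddexp(0, x) evaluates log(1 + exp(x)) without overflow.
    per_position = np.logaddexp(0.0, -labels * scores)
    return float(per_position.mean())

# Toy 3 x 3 response map: the centre is the target, the rest is background.
labels = -np.ones((3, 3))
labels[1, 1] = 1.0

good_scores = 5.0 * labels            # confident and correct everywhere
bad_scores = -5.0 * labels            # confident and wrong everywhere
flat_scores = np.zeros((3, 3))        # uninformative scores

good_loss = logistic_loss_map(labels, good_scores)
bad_loss = logistic_loss_map(labels, bad_scores)
flat_loss = logistic_loss_map(labels, flat_scores)
```

A map whose scores agree in sign with the labels gives a loss near zero, an all-zero map gives log 2 at every position, and confidently wrong scores are penalized roughly in proportion to their magnitude.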

MULTI-LAYER FEATURE FUSION

Findings

CONCLUSION

Journal: IEEE Access	Publication Date: Jan 1, 2020
Citations: 5	License type: CC BY 4.0
