Abstract
Siamese networks have been extensively applied to visual tracking because of their large speed advantage and high precision. In this paper, we propose an efficient framework for real-time object tracking that is trained end-to-end offline: the Fully Conventional Anchor-Free Siamese network (FCAF). Specifically, because the backbone network in existing Siamese trackers is relatively shallow, the trackers acquire insufficient feature information and achieve lower accuracy; we therefore adopt the deep network ResNet-50 to provide a richer feature representation. Meanwhile, a multi-layer feature fusion module combines low-level detail information with high-level semantic features, improving localization performance. In addition, we propose the anchor-free proposal network (AFPN) to replace the region proposal network (RPN). AFPN consists of a correlation section, implemented by depth-wise cross-correlation, and a supervised section with two branches, one for classification and the other for regression. To suppress low-quality predicted bounding boxes, a center-ness branch is added. We conduct extensive experiments on the OTB2015 and VOT2016 public datasets, demonstrating that our proposed tracker achieves state-of-the-art performance.
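The depth-wise cross-correlation named in the abstract can be illustrated with a minimal NumPy sketch. It assumes the usual Siamese-tracker formulation, in which each channel of the template feature map is slid over the same channel of the search feature map; the function name `depthwise_xcorr` and the array shapes are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def depthwise_xcorr(search, template):
    """Correlate each channel of `search` with the same channel of `template`.

    search:   array of shape (C, Hs, Ws), features of the search region
    template: array of shape (C, Ht, Wt), features of the target exemplar
    returns:  array of shape (C, Hs - Ht + 1, Ws - Wt + 1)
    """
    c, hs, ws = search.shape
    ct, ht, wt = template.shape
    if c != ct:
        raise ValueError("search and template must have the same channel count")
    out_h, out_w = hs - ht + 1, ws - wt + 1
    out = np.empty((c, out_h, out_w), dtype=np.result_type(search, template))
    for i in range(out_h):
        for j in range(out_w):
            # Window of the search features under the template, all channels at once.
            window = search[:, i:i + ht, j:j + wt]
            # Sum over the spatial axes only, so channels are never mixed.
            out[:, i, j] = (window * template).sum(axis=(1, 2))
    return out
```

Because channels are never mixed, the response map keeps one channel per feature channel. A practical implementation would more likely express this as a grouped convolution on a GPU; the explicit loops here only make the operation easy to follow.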
Highlights
Visual object tracking is one of the most important tasks in computer vision
This section briefly introduces three aspects related to visual tracking algorithms: trackers based on Siamese networks, multi-layer feature fusion, and anchor-free detection
2 and line 5, 6 from Table 2 suggest that, compared with the region proposal network (RPN), the anchor-free proposal network (AFPN) improves the area under the curve (AUC) and the precision (Prec) by 0.5% and 1.3% with AlexNet, and by 0.6% and 0.5% with ResNet-50, validating that AFPN is more effective than the anchor-based RPN
Summary
Visual object tracking is one of the most important tasks in computer vision. It plays an important role in other research areas such as driver assistance systems [1], target behavior analysis [2], and intelligent video surveillance and control [3], [4]. A new object tracking framework, fully conventional anchor-free for object tracking (FCAF), is proposed. It is built on the ResNet-50 architecture and spatially combines multiple feature layers to take advantage of the rich details of low-level features. (1) We propose an efficient object tracking framework that is trained end-to-end offline with large-scale image pairs, and we modify the stride and receptive field of ResNet-50 so that it can replace AlexNet and provide a richer feature representation for the tracker. (3) We present the anchor-free proposal network (AFPN) to replace the RPN module, which reduces the number of trainable network parameters, improves the convergence speed, and boosts the tracking performance. (4) Our algorithm achieves state-of-the-art performance on the OTB2015 and VOT2016 tracking datasets and runs in real time.
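As a rough illustration of what an anchor-free head predicts in place of anchor offsets, the sketch below computes per-location regression targets and a center-ness score. The summary does not state the formulas, so this assumes the FCOS-style definition commonly used by anchor-free detectors; the function names `ltrb_targets` and `centerness` are hypothetical.

```python
import math


def ltrb_targets(x, y, box):
    """Distances from location (x, y) to the left, top, right and bottom box edges.

    box is (x0, y0, x1, y1) with x0 < x1 and y0 < y1.
    """
    x0, y0, x1, y1 = box
    return x - x0, y - y0, x1 - x, y1 - y


def centerness(l, t, r, b):
    """FCOS-style center-ness (an assumption here): 1 at the box center, 0 on or outside the border."""
    if min(l, t, r, b) <= 0:
        return 0.0
    return math.sqrt((min(l, r) / max(l, r)) * (min(t, b) / max(t, b)))
```

Under this definition, a location far from the target center receives a low center-ness score, which down-weights the bounding box predicted there. No anchor sizes or aspect ratios need to be configured, since each location regresses the four edge distances directly.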