Abstract

Visual tracking has recently made great advances through the use of convolutional neural networks (CNNs). Existing CNN-based trackers usually exploit features from a single layer or a fixed combination of multiple layers. However, such features characterise the object from only one fixed aspect and cannot adapt to scene variation, which limits tracking performance. To overcome this limitation, the authors study the problem from a new perspective and propose a novel convolutional layer selection method. To obtain a robust appearance representation, they investigate the advantages of features extracted from different convolutional layers. To assess the correctness of the tracking prediction and the updated model, they design a verification mechanism based on historical retrospect, which estimates the deviation of each layer by locating the target bidirectionally. This deviation also serves as the layer-wise selection criterion. Extensive evaluations on the OTB-2013, visual object tracking (VOT)-2016 and VOT-2017 benchmarks demonstrate that the proposed tracker performs favourably against several state-of-the-art trackers.
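The abstract only outlines the retrospect-based verification. As a rough illustration of the general idea, the sketch below computes a forward-backward deviation per convolutional layer and picks the layer with the smallest deviation. This is a minimal sketch under stated assumptions, not the authors' implementation: the helper names (`track_forward`, `track_backward`, `select_layer`), the `(x, y, w, h)` box format and the centre-distance deviation measure are all illustrative assumptions.

```python
import numpy as np

def forward_backward_deviation(track_forward, track_backward, frames, start_box):
    """Estimate a layer's deviation by bidirectional localisation (assumed scheme):
    track forward through a window of past frames, retrace backward from the
    forward result, and measure how far the backward trajectory drifts from the
    original position. A small deviation suggests a self-consistent prediction."""
    # Forward pass: start_box -> box in the last frame of the window
    forward_box = track_forward(frames, start_box)
    # Backward pass: retrace from the last frame back to the first
    backward_box = track_backward(frames[::-1], forward_box)
    # Deviation: centre distance between the original box and the box
    # recovered by the backward pass (boxes are (x, y, w, h))
    cx0, cy0 = start_box[0] + start_box[2] / 2, start_box[1] + start_box[3] / 2
    cx1, cy1 = backward_box[0] + backward_box[2] / 2, backward_box[1] + backward_box[3] / 2
    return np.hypot(cx1 - cx0, cy1 - cy0)

def select_layer(layer_trackers, frames, start_box):
    """Hypothetical layer-wise selection: choose the convolutional layer whose
    tracker shows the smallest forward-backward deviation over the window."""
    deviations = {
        name: forward_backward_deviation(t.forward, t.backward, frames, start_box)
        for name, t in layer_trackers.items()
    }
    return min(deviations, key=deviations.get)
```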
