Abstract

Tracking models that build bounding box regression (such as region proposal networks) on top of a Siamese network have recently attracted much attention. Despite their promising performance, these trackers perceive target information poorly in two respects. First, existing regression models cannot take a global view of a large-scale target, because the effective receptive field of a neuron is too small to cover it. Second, neurons with a fixed receptive field (RF) size cannot adapt to changes in the scale and aspect ratio of the target. In this paper, we propose an adaptive ensemble perception tracking framework to address these issues. Specifically, we first construct a per-pixel prediction model, which predicts the target state at each pixel of the correlated feature. On top of the per-pixel prediction model, we then develop a confidence-guided ensemble prediction mechanism. This mechanism adaptively fuses the predictions of multiple pixels under the guidance of confidence maps, which enlarges the perception range and enhances the adaptive perception ability at the object level. In addition, we introduce a receptive field adaption model that enhances the adaptive perception ability at the neuron level by adaptively integrating features with different RFs. Extensive experimental results on the VOT2018, VOT2016, UAV123, LaSOT, and TC128 datasets demonstrate that the proposed algorithm performs favorably against state-of-the-art methods in terms of both accuracy and speed.
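
As a rough illustration of what fusing per-pixel predictions under confidence guidance could look like, the sketch below averages the boxes predicted at the most confident pixels, weighted by their confidence scores. The abstract does not specify the fusion rule, so the function name `fuse_boxes`, the `top_k` parameter, the corner-format boxes, and the weighted-average rule itself are all assumptions made for illustration, not the authors' method.

```python
from typing import List, Tuple

# Hypothetical box format: (x1, y1, x2, y2) corners in image coordinates.
Box = Tuple[float, float, float, float]


def fuse_boxes(boxes: List[Box], confidences: List[float], top_k: int = 3) -> Box:
    """Confidence-weighted average of the top_k most confident per-pixel boxes.

    This is an assumed fusion rule for illustration only.
    """
    if not boxes or len(boxes) != len(confidences):
        raise ValueError("boxes and confidences must be non-empty and equal in length")
    if top_k < 1:
        raise ValueError("top_k must be at least 1")

    # Keep only the top_k predictions with the highest confidence.
    ranked = sorted(zip(confidences, boxes), key=lambda pair: pair[0], reverse=True)
    ranked = ranked[:top_k]

    total = sum(conf for conf, _ in ranked)
    if total <= 0:
        # No usable weights: fall back to the single highest-ranked box.
        return ranked[0][1]

    # Weighted average of each of the four box coordinates.
    fused = [sum(conf * box[i] for conf, box in ranked) / total for i in range(4)]
    return (fused[0], fused[1], fused[2], fused[3])


# Two confident pixels roughly agree; the third, zero-confidence box is ignored.
predictions = [(0.0, 0.0, 10.0, 10.0), (2.0, 2.0, 12.0, 12.0), (100.0, 100.0, 200.0, 200.0)]
scores = [0.5, 0.5, 0.0]
print(fuse_boxes(predictions, scores, top_k=2))
```

Under this assumed rule, the fused estimate draws on several spatial positions rather than a single peak, which is one plausible reading of how an ensemble of pixels could widen the perception range.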
