Abstract

Siamese networks have attracted widespread attention in target tracking because they balance tracking accuracy with efficient execution speed. However, when the search area contains semantically similar information, most Siamese trackers struggle to localize the target under this interference, which greatly reduces their robustness. To mine feature information more effectively and improve localization accuracy, this work proposes a Siamese multi-level classification and regression (SiamMCAR) framework. SiamMCAR first introduces a residual channel attention module into the template branch of the Siamese subnetwork. The module uses the relationships between feature channels to determine the channel weights of the target template feature, so that template feature extraction focuses on the channels that describe the target foreground. Then, a multi-level classification and regression subnetwork containing multiple classification and regression modules is constructed. The output feature maps of the different classification and regression modules are weighted and fused using multiple trained weights, which enables the subnetwork to obtain more classification and regression results from the shallow cross-correlation response map and thereby makes localization more accurate. Extensive experiments on benchmarks including OTB-50, OTB-100, TC-128, GOT-10K and LaSOT show that SiamMCAR achieves excellent performance on a variety of challenging target tracking tasks while running at 28 FPS.
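
The two mechanisms named in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation. The abstract does not give the exact form of either module, so the sketch assumes a squeeze-and-excitation style gate with a residual connection (x + g * x) for the channel attention, and softmax-normalised scalar weights for the fusion. The function names `channel_attention` and `fuse_levels`, and the parameters `w1`, `w2` and `logits`, are hypothetical.

```python
import math


def channel_attention(feat, w1, w2):
    """Residual channel attention over feat[c][h][w] (assumed SE-style form).

    w1 has shape [hidden][channels]; w2 has shape [channels][hidden].
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    pooled = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feat]
    # Excitation: bottleneck layer with ReLU, then a sigmoid gate per channel.
    hidden = [max(0.0, sum(w * p for w, p in zip(row, pooled))) for row in w1]
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Residual re-weighting (assumption): x + g * x keeps every channel alive.
    return [
        [[v * (1.0 + g) for v in row] for row in ch]
        for ch, g in zip(feat, gates)
    ]


def fuse_levels(maps, logits):
    """Fuse same-sized 2D maps with softmax-normalised weights (assumption)."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    weights = [e / total for e in exps]
    h, w = len(maps[0]), len(maps[0][0])
    return [
        [sum(wt * mp[i][j] for wt, mp in zip(weights, maps)) for j in range(w)]
        for i in range(h)
    ]
```

In a trained tracker, `w1`, `w2` and `logits` would be learned parameters, and the maps passed to `fuse_levels` would be the classification or regression outputs of the individual modules. Here they are plain Python lists so the sketch runs without a deep learning library.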
