Abstract
Trackers based on region proposal networks (RPN) use classification and regression blocks to generate proposals; the proposal with the highest similarity score is taken as the target candidate for the next frame. However, RPN-based trackers do not make full use of the features from different convolutional layers, and the original loss function does not alleviate the data imbalance in the training procedure. We propose the Spatial Cascaded Transformed RPN (SCTRPN), which combines the RPN with a spatial transformer network (STN) to obtain high-quality proposals and, at the same time, improve robustness. The STN passes spatially transformed features through different stages, which extends the spatial representation capability of the network in complex scenarios such as scale variation and affine transformation. To address the data imbalance, we replace the smooth L1 function with a loss that penalizes easy samples (shrinkage loss). Moreover, we re-rank the proposals using multiple cues to improve the accuracy of the proposed tracker. We validate the proposed method extensively, including ablation studies, on the tracking datasets OTB-2015 (Object Tracking Benchmark 2015), VOT-2018 (Visual Object Tracking 2018), LaSOT (Large Scale Single Object Tracking), TrackingNet (A Large-Scale Dataset and Benchmark for Object Tracking in the Wild) and UAV123 (UAV Tracking Dataset).
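The abstract names the shrinkage loss but does not give its form. As a rough illustration only, the sketch below assumes the commonly cited form l^2 / (1 + exp(a (c - l))), where l is the absolute regression residual. The hyperparameters `a` and `c`, their default values, and the function names are assumptions for illustration, not values taken from this paper.

```python
import math


def smooth_l1(diff, beta=1.0):
    """Smooth L1 loss for a single residual (quadratic near zero, linear beyond beta)."""
    d = abs(diff)
    return 0.5 * d * d / beta if d < beta else d - 0.5 * beta


def shrinkage_loss(diff, a=10.0, c=0.2):
    """Squared error scaled by a sigmoid-shaped factor (assumed form).

    a (shrinkage speed) and c (localization of the shrinkage) are
    hypothetical defaults chosen for illustration.
    Residuals well below c are strongly down-weighted (easy samples);
    residuals well above c keep almost their full squared error.
    """
    l = abs(diff)
    return l * l / (1.0 + math.exp(a * (c - l)))


if __name__ == "__main__":
    # Compare both losses on an easy, a borderline, and a hard residual.
    for residual in (0.05, 0.2, 1.0):
        print(residual, smooth_l1(residual), shrinkage_loss(residual))
```

Under these assumptions, an easy sample contributes only a small fraction of its squared error, while a hard sample contributes nearly all of it, which is the easy-sample penalization the abstract describes.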
Highlights
Visual tracking has drawn sustained attention from researchers and engineers over the last decades. Advances in related research have inspired novel applications, such as automatic tracking by drones [1], pose recognition for mobile payment [2], and remote control of space robots [3].
Much progress [5] has been made recently by combining region proposal networks (RPN) with Siamese networks [6].
We train the SCTRPN on images sampled at random intervals from the same sequence.
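The random interval sampling mentioned in the last highlight can be sketched as follows. This is a minimal illustration, assuming that each training example is a template/search frame pair drawn from one sequence and that the frame gap is capped. The function name, the pairing scheme, and the `max_interval` parameter are hypothetical and are not specified in the source.

```python
import random


def sample_frame_pair(num_frames, max_interval=100, rng=random):
    """Pick two distinct frame indices from one sequence, at most max_interval apart.

    num_frames: length of the sequence.
    max_interval: hypothetical cap on the frame gap (an assumption).
    rng: any object with randrange() and choice(), e.g. random.Random(seed).
    """
    if num_frames < 2:
        raise ValueError("need at least two frames to form a pair")
    if max_interval < 1:
        raise ValueError("max_interval must be at least 1")
    template = rng.randrange(num_frames)
    # Restrict the search frame to a window around the template frame.
    lo = max(0, template - max_interval)
    hi = min(num_frames - 1, template + max_interval)
    candidates = [i for i in range(lo, hi + 1) if i != template]
    search = rng.choice(candidates)
    return template, search


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed only to make the demo repeatable
    print([sample_frame_pair(300, max_interval=50, rng=rng) for _ in range(3)])
```

Passing a seeded `random.Random` instance keeps the sampling reproducible without touching the global random state.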
Summary
Visual tracking has drawn sustained attention from researchers and engineers over the last decades. Advances in related research have inspired novel applications, such as automatic tracking by drones [1], pose recognition for mobile payment [2], and remote control of space robots [3]. Although researchers are making steady progress, it remains a vital problem to achieve tracking that balances accuracy, robustness, and speed under complex scenarios such as occlusion, illumination change, and scale variation [4]. Much progress [5] has been made recently by combining region proposal networks (RPN) with Siamese networks [6]. Some trackers treat tracking as the generation of a similarity response map, which distinguishes the target template from the search candidates. The candidate position with the highest similarity score is taken as the new target location. SiamRPN [7] combines Siamese networks and region proposal networks in order to jointly perform classification and regression for tracking. DaSiamRPN [8] introduces a distractor-aware module to distinguish hard negatives from easy ones, which could improve