Abstract

Conventional convolutional neural network (CNN)-based visual trackers are easily misled by excessive background information in candidate samples. Moreover, the extreme imbalance between foreground and background samples hampers classifier training, and features learned from limited data are insufficient to train a robust classifier. To address these problems, we propose a novel deep neural network for visual tracking, termed the domain activation mapping guided network (DA-GNT). First, we introduce class activation mapping with weakly supervised localization in a multi-domain setting to identify the most discriminative regions within the bounding box and suppress background in positive samples. Next, to further increase the discriminability of the deep feature representation, we employ an ensemble network to obtain a multi-view feature representation together with a channel attention mechanism for adaptive feature selection. Finally, we propose a simple but effective data augmentation method to enlarge the set of positive samples for network training. Extensive experiments on two widely used benchmark datasets demonstrate the effectiveness of the proposed tracker against many state-of-the-art trackers. DA-GNT is thus posited as a potential benchmark resource for the computer vision and machine learning research community.
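As a rough illustration of the two mechanisms named above, the sketch below shows a plain class-activation-mapping (CAM) foreground weighting step and a squeeze-and-excitation-style channel attention module. All module names, shapes, and hyperparameters are hypothetical stand-ins chosen for clarity; this is not the authors' released implementation of DA-GNT.

```python
# Hypothetical sketch only: CAM-guided background suppression plus
# SE-style channel attention. Not the authors' actual DA-GNT code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ChannelAttention(nn.Module):
    """SE-style channel attention, used here as a stand-in for adaptive feature selection."""

    def __init__(self, channels, reduction=4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, feat):                      # feat: (B, C, H, W)
        weights = self.fc(feat.mean(dim=(2, 3)))  # global average pooling -> (B, C)
        return feat * weights[:, :, None, None]   # re-weight each channel


def cam_foreground_weighting(feat, fc_weight, target_class):
    """Weight spatial features by the classifier weights of the target class (plain CAM).

    High CAM responses mark the most discriminative, likely-foreground regions,
    so multiplying by the normalized map suppresses background activations.
    feat: (B, C, H, W); fc_weight: (num_classes, C).
    """
    cam = torch.einsum('c,bchw->bhw', fc_weight[target_class], feat)
    cam = F.relu(cam)
    cam = cam / (cam.amax(dim=(1, 2), keepdim=True) + 1e-6)  # normalize to [0, 1]
    return feat * cam.unsqueeze(1)
```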
