Abstract

Convolutional neural networks offer strong representational power and have been widely applied to visual tracking. However, simply deepening the network to boost performance is inappropriate for tracking because of its speed requirement. We leverage spatial attention and channel attention to enhance object features without much extra computational cost. The fused spatial-channel attention enables the network to extract discriminative and robust features of targets and background. Furthermore, we propose an inter-instance loss that makes our tracker aware not only of target-background classification but also of instance classification across multiple domains. Extensive experiments on the Object Tracking Benchmark (OTB) show that the proposed tracker achieves an Area-Under-Curve (AUC) score of 66.8% on OTB2015, outperforming most state-of-the-art trackers.
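
The abstract does not spell out how the spatial and channel attentions are fused, so the following is only a minimal sketch of one common way such a fused spatial-channel attention block can refine a backbone feature map with little extra cost. The module name, reduction ratio, and layer sizes are illustrative assumptions, not the authors' implementation; the inter-instance loss is not shown here.

```python
import torch
import torch.nn as nn

class SpatialChannelAttention(nn.Module):
    """Illustrative fused spatial-channel attention block (hypothetical design)."""

    def __init__(self, channels, reduction=16):
        super().__init__()
        # Channel attention: squeeze spatial dims, then excite per-channel weights.
        self.channel_gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        # Spatial attention: compress channels into a per-location weight map.
        self.spatial_gate = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size=7, padding=3),
            nn.Sigmoid(),
        )

    def forward(self, x):
        # Channel attention weights, shape (N, C, 1, 1), applied first.
        x = x * self.channel_gate(x)
        # Spatial attention from channel-wise average and max maps, shape (N, 1, H, W).
        avg_map = x.mean(dim=1, keepdim=True)
        max_map, _ = x.max(dim=1, keepdim=True)
        return x * self.spatial_gate(torch.cat([avg_map, max_map], dim=1))

# Example: refine a small backbone feature map before classification.
feat = torch.randn(1, 512, 7, 7)
refined = SpatialChannelAttention(512)(feat)
print(refined.shape)  # torch.Size([1, 512, 7, 7])
```

Because both gates are built from a handful of 1x1 and 7x7 convolutions, the added computation stays small relative to the backbone, which is the trade-off the abstract emphasizes.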
