Abstract

Recognizing fine-grained objects (e.g., dog breeds or car models) is difficult because it requires both accurate localization of discriminative regions and fine-grained feature learning. Current approaches neglect the fact that local context features and global features are mutually correlated, so each can benefit from the information carried by the other. In this paper, we propose a novel multi-scale attention-fusing convolutional neural network that learns region-based features and discriminative object attention at multiple scales in a mutually reinforcing way. The network is organized into several scales, each consisting of a classification sub-network and an attention fusing module. An Lrefine loss is proposed to refine the classification performance of the second sub-network. We conduct comprehensive experiments and show that the proposed method achieves the best performance on two fine-grained tasks, CUB Birds and Stanford Cars, with relative mean accuracy gains of 2.2%.
