A novel visual classification framework on panoramic attention mechanism network

Wenshu Li, Lingzhi Yin, Xiaoying Guo, Shenhao Li, Xu Yang

doi:10.1049/cvi2.12105


Open Access

https://doi.org/10.1049/cvi2.12105


Abstract

Fine-grained classification is challenging because discriminative features are hard to find and the regions that contain them are hard to localize. To address these challenges, we propose a novel visual classification framework based on a panoramic attention mechanism, which combines multiple attention networks to locate and identify features of greater semantic interest. First, building on a classical convolutional neural network, the global information of the image features is expressed by linear fusion. Second, a foreground attention branch further extracts the distinguishing details of the salient features. Third, a background attention branch mines additional features from the complementary object area to learn a more complete fine-grained feature representation. Finally, the three network branches are trained together to strengthen the network's ability to express representative features of fine-grained images. The model can be viewed as a multi-branch network whose branches benefit one another and are optimized jointly. Experiments were conducted on the CUB-200-2011, Stanford Dogs and FGVC-Aircraft datasets, with accuracy as the quantitative measure. The results show that the proposed method achieves the highest accuracy, with an average accuracy of 89.8%, and that it is effective and superior to current advanced methods.
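
The three-branch idea described in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not specify how the attention maps, the linear fusion, or the loss are computed, so everything here is an assumption made for illustration and not the authors' implementation: the attention map is taken to be a min-max-scaled channel average, the background branch uses its complement, each branch has its own hypothetical linear classifier, and joint training is modelled as a plain sum of per-branch cross-entropy losses. The function names are likewise hypothetical.

```python
import numpy as np

def attention_map(features):
    """Channel-averaged activation map scaled to [0, 1]; features has shape (C, H, W).
    Assumption: a simple stand-in for a learned attention module."""
    act = features.mean(axis=0)
    span = act.max() - act.min()
    if span == 0:
        return np.zeros_like(act)
    return (act - act.min()) / span

def branch_descriptors(features):
    """Pooled descriptors for the global, foreground and background branches."""
    att = attention_map(features)
    global_desc = features.mean(axis=(1, 2))            # whole feature map
    fg_desc = (features * att).mean(axis=(1, 2))        # salient regions
    bg_desc = (features * (1.0 - att)).mean(axis=(1, 2))  # complementary regions
    return global_desc, fg_desc, bg_desc

def cross_entropy(logits, label):
    """Numerically stable softmax cross-entropy for a single sample."""
    shifted = logits - logits.max()
    log_probs = shifted - np.log(np.exp(shifted).sum())
    return -log_probs[label]

def joint_loss(features, weights, label):
    """Sum of the three branch losses (assumed form of joint training).
    weights: list of three (num_classes, C) matrices, one per branch."""
    descs = branch_descriptors(features)
    return sum(cross_entropy(w @ d, label) for w, d in zip(weights, descs))
```

Under these assumptions the foreground and background weightings sum to one at every location, so the two attention descriptors together account for the global descriptor. A learned attention module would not generally have this property.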

Full Text