Abstract

Multi-label classification is one of the most essential tasks in computer vision. In multi-label image classification, exploiting the correlation between labels is a powerful way to improve classification performance. However, common methods ignore the interrelationship between label pairs. Introducing a spatial attention mechanism can also improve classification performance, but most methods that use an attention module do not exploit the correlation information between labels. To address these issues, we propose a novel multi-label image classification model that uses label correlation. Our model generates label word vectors with the BERT model, which can describe the potential relationships between labels. We then combine these vectors with static label statistics to construct a new label correlation matrix. Moreover, we introduce label semantic information into the spatial attention mechanism. With this semantic information, the generated spatial attention map can focus on the image feature regions associated with correlated labels, which lets the model classify accurately. On the Microsoft COCO data set, our model achieves a best mAP of 84.3%, which demonstrates its effectiveness.
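To make the label correlation matrix more concrete, the following is a minimal sketch of one way to blend co-occurrence statistics with label embedding similarity. It is not the method of the paper: the abstract does not state how the two sources are combined, so the weighted sum, the weight `alpha`, the clipping of negative similarities, and the function names `cooccurrence_prob`, `cosine`, and `label_correlation` are all assumptions made for illustration. The short vectors in the usage note stand in for BERT label embeddings.

```python
import math


def cooccurrence_prob(label_sets, num_labels):
    """Estimate P(label j present | label i present) from image annotations."""
    counts = [[0] * num_labels for _ in range(num_labels)]
    for labels in label_sets:
        unique = set(labels)  # guard against duplicated labels on one image
        for i in unique:
            for j in unique:
                counts[i][j] += 1
    probs = [[0.0] * num_labels for _ in range(num_labels)]
    for i in range(num_labels):
        occurrences = counts[i][i]  # number of images that contain label i
        for j in range(num_labels):
            probs[i][j] = counts[i][j] / occurrences if occurrences else 0.0
    return probs


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def label_correlation(label_sets, embeddings, alpha=0.5):
    """Blend statistics and embedding similarity (assumed: a weighted sum)."""
    n = len(embeddings)
    probs = cooccurrence_prob(label_sets, n)
    return [
        [
            alpha * probs[i][j]
            # Assumption: negative similarities are clipped to zero.
            + (1 - alpha) * max(0.0, cosine(embeddings[i], embeddings[j]))
            for j in range(n)
        ]
        for i in range(n)
    ]
```

For example, `label_correlation([{0, 1}, {0}, {1, 2}], [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])` returns a 3x3 matrix. The matrix is generally not symmetric, because the conditional co-occurrence probability of label j given label i differs from that of label i given label j.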
