Abstract

Multi-label image classification aims to predict all of the labels associated with a given image. Most existing methods rely on a single unified image representation, although extracting label-specific features through input space learning can improve the discriminative power of the learned features. Conversely, most feature learning studies ignore learning in the output label space, even though exploiting label correlations can boost classification performance. In this paper, we propose a deep learning framework with flexible modules that learn from both the input and output spaces for multi-label image classification. For input space learning, we devise a label-specific feature pooling method that refines convolutional features to obtain features specific to each label. For output space learning, we design a Two-Stream Graph Convolutional Network (TSGCN) that learns multi-label classifiers by mapping spatial object relationships and semantic label correlations. More specifically, we build object spatial graphs to characterize the spatial relationships among objects in an image, which supplement the label semantic graphs that model semantic label correlations. Experimental results on two popular benchmark datasets (Pascal VOC and MS-COCO) show that our proposed method outperforms state-of-the-art methods.
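
To make the two ideas above concrete, the following is a minimal sketch rather than the method proposed in the paper. It assumes that label-specific pooling can be approximated by a softmax attention over spatial locations, and that classifiers are produced by a single graph-convolution step over one label graph; the two-stream fusion is not shown. The names `label_queries`, `label_embeddings`, and `weight`, the toy chain graph, and all sizes are hypothetical and chosen only for illustration.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalize_adjacency(adj):
    """Add self-loops, then scale entry (i, j) by 1 / sqrt(deg_i * deg_j)."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def label_specific_pooling(locations, label_queries):
    """Pool per-location features into one feature vector per label.

    locations: list of feature vectors, one per spatial location.
    label_queries: one (hypothetical) learned query vector per label.
    Assumption: attention is a softmax over dot-product scores.
    """
    channels = len(locations[0])
    pooled = []
    for query in label_queries:
        scores = [sum(q * f for q, f in zip(query, feat)) for feat in locations]
        top = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        pooled.append([
            sum(w * feat[c] for w, feat in zip(weights, locations)) / total
            for c in range(channels)
        ])
    return pooled


def random_matrix(rows, cols):
    """Stand-in for learned parameters or CNN outputs (illustration only)."""
    return [[random.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


random.seed(0)
num_labels, channels, embed_dim, num_locations = 4, 8, 6, 25

locations = random_matrix(num_locations, channels)    # flattened 5x5 feature map
label_queries = random_matrix(num_labels, channels)   # hypothetical attention queries
label_embeddings = random_matrix(num_labels, embed_dim)
weight = random_matrix(embed_dim, channels)           # graph-convolution weights

# Toy label graph: a chain 0 - 1 - 2 - 3 (assumption, not from any dataset).
adj = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
a_norm = normalize_adjacency(adj)

# Input space: one feature vector per label.
label_features = label_specific_pooling(locations, label_queries)

# Output space: propagate label embeddings over the graph to get classifiers.
classifiers = matmul(matmul(a_norm, label_embeddings), weight)

# One score per label: dot product of its feature vector and its classifier.
logits = [sum(f * c for f, c in zip(feat, clf))
          for feat, clf in zip(label_features, classifiers)]
```

In this sketch each label ends up with its own pooled feature vector and its own classifier vector, so the score for a label depends both on where that label attends in the image and on its neighbours in the label graph.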
