Beyond Word Embeddings: Heterogeneous Prior Knowledge Driven Multi-Label Image Classification

Xiang Deng, Gengyu Lyu, Congyan Lang, Tao Wang, Songhe Feng

doi:10.1109/tmm.2022.3171095

Abstract

Multi-Label Image Classification (MLIC) is a fundamental yet challenging task that aims to recognize multiple labels in a given image. The key to solving MLIC lies in accurately modeling the correlation between labels. Recent studies often adopt a Graph Convolutional Network (GCN) to model label dependencies, with word embeddings as prior knowledge. However, classical word embeddings typically contain redundant information because of the imperfect distributional hypothesis they rely on, which may degrade model generalizability. To tackle this problem, we propose a novel deep learning framework termed Visual-Semantic based Graph Convolutional Network (VSGCN), which alleviates the negative impact of redundant information by utilizing heterogeneous sources of prior knowledge. Specifically, we construct both a visual prototype and a semantic prototype for each label as heterogeneous prior label representations, which are then mapped to multi-label classifiers via two separate Multi-Head GCNs. The Multi-Head GCN mechanism proposed in this paper aims to guide the information propagation between prototypes for each label; it constructs multiple correlation graphs to simultaneously model the label correlation in different subspaces. Notably, we alleviate the negative influence of needless information by decreasing the inconsistency between the predictions that come from the visual space and the semantic space. Extensive experiments conducted on various multi-label image datasets demonstrate the superiority of our proposed method.
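
The pipeline described in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation: the abstract does not specify how the correlation graphs are built, how the heads are combined, or the exact form of the consistency term. The sketch assumes a softmax over cosine similarities for each head's graph, concatenation of head outputs, and a mean squared difference between the two branches' sigmoid predictions. All function names, dimensions, and random inputs are hypothetical.

```python
import numpy as np


def row_softmax(x):
    """Numerically stable softmax applied to each row."""
    x = x - x.max(axis=1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=1, keepdims=True)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def multi_head_gcn_layer(prototypes, head_weights):
    """One propagation step over label prototypes with several heads.

    prototypes:   (num_labels, d_in) prior label representations.
    head_weights: list of (d_in, d_head) projections, one per head.
    Each head projects the prototypes into its own subspace, builds its
    own label correlation graph there (assumption: softmax over cosine
    similarity), and propagates information along that graph.
    """
    outputs = []
    for w in head_weights:
        h = prototypes @ w                                   # (C, d_head)
        h_unit = h / (np.linalg.norm(h, axis=1, keepdims=True) + 1e-8)
        adj = row_softmax(h_unit @ h_unit.T)                 # (C, C)
        outputs.append(adj @ h)                              # propagate
    return np.concatenate(outputs, axis=1)                   # (C, heads * d_head)


def consistency_loss(p_visual, p_semantic):
    """Assumed consistency term: mean squared gap between branch predictions."""
    return float(np.mean((p_visual - p_semantic) ** 2))


# Hypothetical sizes and random stand-ins for real prototypes and features.
rng = np.random.default_rng(0)
num_labels, d_visual, d_semantic = 5, 16, 12
d_head, num_heads, batch = 4, 2, 3
feat_dim = d_head * num_heads

visual_protos = rng.normal(size=(num_labels, d_visual))
semantic_protos = rng.normal(size=(num_labels, d_semantic))
visual_heads = [0.1 * rng.normal(size=(d_visual, d_head)) for _ in range(num_heads)]
semantic_heads = [0.1 * rng.normal(size=(d_semantic, d_head)) for _ in range(num_heads)]
image_features = rng.normal(size=(batch, feat_dim))

# Each branch maps its prototypes to one classifier vector per label.
visual_classifiers = multi_head_gcn_layer(visual_protos, visual_heads)
semantic_classifiers = multi_head_gcn_layer(semantic_protos, semantic_heads)

# Per-label probabilities from each branch, and the gap between them.
p_visual = sigmoid(image_features @ visual_classifiers.T)
p_semantic = sigmoid(image_features @ semantic_classifiers.T)
loss = consistency_loss(p_visual, p_semantic)
```

In a trainable version, a term like `loss` would be added to the usual multi-label classification loss so that the two branches are pushed toward agreeing predictions.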

Journal: IEEE Transactions on Multimedia
Publication Date: Jan 1, 2023
Citations: 7

Similar Papers

Multi-Label Image Classification with Attention Mechanism and Graph Convolutional Networks
Quanling Meng ... Weigang Zhang
15 Dec 2019

Zero shot learning based on class visual prototypes and semantic consistency
Xiao Li ... Jinqiao Wu
Pattern Recognition Letters | VOL. 135
08 May 2020

Word Embeddings for Natural Language Processing
01 Jan 2015

Transformer Driven Matching Selection Mechanism for Multi-Label Image Classification
Yanan Wu ... Yi Jin
IEEE Transactions on Circuits and Systems for Video Technology | VOL. 34
01 Feb 2024