Leveraging multi-modal fusion for graph-based image annotation

S Hamid Amiri, Mansour Jamzad

doi:10.1016/j.jvcir.2018.08.012

Abstract

When each visual feature is treated as a separate modality in the image annotation task, efficient fusion of the modalities is essential for graph-based learning. Traditional graph-based methods use one node per image and combine its visual features into a single descriptor before constructing the graph. In this paper, we propose an approach that constructs a subgraph for each modality, with the edges of each subgraph determined by a search-based approach that handles the class-imbalance challenge in annotation datasets. The subgraphs are then connected to one another to form a supergraph. We then introduce a learning framework that infers the tags of unannotated images on the supergraph. The proposed approach takes advantage of graph-based semi-supervised learning and multi-modal representation simultaneously. We evaluate the proposed approach on different datasets. The results show that it improves the accuracy of annotation systems.
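The pipeline outlined in the abstract (one subgraph per modality, subgraphs linked into a supergraph, tags inferred by semi-supervised learning) can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not specify the search-based edge selection or the learning framework, so the sketch assumes plain k-nearest-neighbour edges with Gaussian weights and substitutes standard closed-form label propagation on the normalized graph. All function names, parameters (`k`, `cross_weight`, `alpha`) and the toy features are hypothetical.

```python
import numpy as np


def knn_subgraph(features, k):
    """Symmetric k-NN graph with Gaussian weights for ONE modality.

    Assumption: plain k-NN stands in for the paper's search-based,
    imbalance-aware edge selection, which the abstract does not detail.
    """
    n = features.shape[0]
    sq_dists = ((features[:, None, :] - features[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(sq_dists, np.inf)                 # exclude self-loops
    neighbours = np.argsort(sq_dists, axis=1)[:, :k]   # k closest images per row
    sigma2 = np.median(sq_dists[np.isfinite(sq_dists)]) + 1e-12
    rows = np.repeat(np.arange(n), k)
    cols = neighbours.ravel()
    weights = np.zeros((n, n))
    weights[rows, cols] = np.exp(-sq_dists[rows, cols] / sigma2)
    return np.maximum(weights, weights.T)              # make the graph undirected


def build_supergraph(modalities, k=2, cross_weight=1.0):
    """Place each modality's subgraph on the block diagonal and link the
    copies of the same image across modalities (hypothetical linking rule)."""
    n, m = modalities[0].shape[0], len(modalities)
    W = np.zeros((n * m, n * m))
    for a, feats in enumerate(modalities):
        W[a * n:(a + 1) * n, a * n:(a + 1) * n] = knn_subgraph(feats, k)
    idx = np.arange(n)
    for a in range(m):
        for b in range(a + 1, m):
            W[a * n + idx, b * n + idx] = cross_weight
            W[b * n + idx, a * n + idx] = cross_weight
    return W


def propagate_tags(W, tags, labelled, n_modalities, alpha=0.9):
    """Closed-form label propagation, F = (I - alpha * S)^-1 Y, used here
    as a stand-in for the paper's learning framework."""
    n = tags.shape[0]
    Y = np.where(labelled[:, None], tags, 0.0)         # hide tags of unlabelled images
    Y = np.tile(Y, (n_modalities, 1))                  # one copy per modality node
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(W.sum(axis=1), 1e-12))
    S = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]  # symmetric normalization
    F = np.linalg.solve(np.eye(W.shape[0]) - alpha * S, Y)
    return F.reshape(n_modalities, n, -1).mean(axis=0) # average over modalities


# Toy data: six images, two modalities, two tags; only images 0 and 3 are annotated.
colour = np.array([[0.0, 0.1], [0.1, 0.0], [0.2, 0.1],
                   [5.0, 5.1], [5.1, 5.0], [5.2, 5.1]])
texture = np.array([[1.0], [1.1], [0.9], [4.0], [4.1], [3.9]])
tags = np.array([[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]], dtype=float)
labelled = np.array([True, False, False, True, False, False])

W = build_supergraph([colour, texture], k=2)
scores = propagate_tags(W, tags, labelled, n_modalities=2)
predicted = scores.argmax(axis=1)
print(predicted)
```

In this sketch each image appears once per modality, so an image that is a close neighbour in only one feature space can still pass tag information through that modality's subgraph. Averaging the per-modality scores at the end is one simple way to read out a single prediction per image; other read-out rules are equally plausible.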

Journal: Journal of Visual Communication and Image Representation
Publication Date: Aug 1, 2018
Citations: 6

Similar Papers

Automatic Image Annotation Based on Sparse Representation and Multiple Label Learning
Feng Tian ... Shang Fu-Hua
01 Sep 2012

Image Annotation by Propagating Labels from Semantic Neighbourhoods
Yashaswi Verma ... C V Jawahar
International Journal of Computer Vision | VOL. 121
12 Jul 2016

Semantics-Preserving Bag-of-Words Models and Applications
Lei Wu ... Nenghai Yu
IEEE Transactions on Image Processing | VOL. 19
11 Mar 2010

Purposive Hidden-Object-Game: Embedding Human Computation in Popular Game
Jiashi Feng ... Zilei Wang
IEEE Transactions on Multimedia | VOL. 14
01 Oct 2012
