A Context-Driven Extractive Framework for Generating Realistic Image Descriptions.

Amara Tariq, Hassan Foroosh

doi:10.1109/tip.2016.2628585

Open Access

https://doi.org/10.1109/tip.2016.2628585

Abstract

Automatic image annotation methods are extremely beneficial for image search, retrieval, and organization systems. The lack of strict correlation between semantic concepts and visual features, referred to as the semantic gap, is a huge challenge for annotation systems. In this paper, we propose an image annotation model that incorporates contextual cues collected from sources both intrinsic and extrinsic to images, to bridge the semantic gap. The main focus of this paper is a large real-world data set of news images that we collected. Unlike standard image annotation benchmark data sets, our data set does not require human annotators to generate artificial ground truth descriptions after data collection, since our images already include contextually meaningful, real-world captions written by journalists. We thoroughly study the nature of image descriptions in this real-world data set. News image captions describe both the visual contents and the contexts of images. Auxiliary information sources are also available with such images in the form of news articles and metadata (e.g., keywords and categories). The proposed framework extracts contextual cues from the available sources of different data modalities and transforms them into a common representation space, i.e., the probability space. Predicted annotations are later transformed into sentence-like captions through an extractive framework applied over news articles. Our context-driven framework outperforms the state of the art on the collected data set of approximately 20,000 items, as well as on a previously available smaller news image data set.
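
The abstract does not spell out how cues are combined or how sentences are extracted. As a purely illustrative sketch, the two stages it names (mapping cues from several sources into a shared probability space, then extracting a caption from the news article) might look like the following. The weighted linear mixture, the mean-probability sentence score, and all function names here are assumptions made for illustration, not the authors' method.

```python
import re
from collections import Counter


def to_distribution(tokens):
    """Turn a token list into a probability distribution over words."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()} if total else {}


def fuse(distributions, weights):
    """Weighted mixture of word distributions (assumption: weights sum to 1)."""
    fused = Counter()
    for dist, weight in zip(distributions, weights):
        for word, p in dist.items():
            fused[word] += weight * p
    return dict(fused)


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def best_sentence(article, word_probs):
    """Return the article sentence with the highest mean fused word probability."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", article) if s.strip()]

    def score(sentence):
        tokens = tokenize(sentence)
        if not tokens:
            return 0.0
        return sum(word_probs.get(t, 0.0) for t in tokens) / len(tokens)

    return max(sentences, key=score)


# Hypothetical inputs: metadata keywords and words predicted from visual features.
article = (
    "The mayor opened the bridge on Monday. "
    "Traffic was light across the city. "
    "Officials praised the bridge design."
)
keyword_cues = to_distribution(["bridge", "mayor"])
visual_cues = to_distribution(["bridge", "river", "crowd"])

fused = fuse([keyword_cues, visual_cues], [0.5, 0.5])
caption = best_sentence(article, fused)
print(caption)
```

Because every source is reduced to a distribution over the same vocabulary, adding another modality in this sketch only means passing one more distribution and weight to `fuse`.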

Journal: IEEE Transactions on Image Processing
Publication Date: Nov 14, 2016
Citations: 85
License type: publisher-specific-oa

Similar Papers

A Multi-feature Fusion Method for Automatic Multi-label Image Annotation with Weighted Histogram Integral and Closure Regions Counting
Sen Xia ... Xiao-Ping Li
01 Jan 2015

Context-based multi-label image annotation
Zhiwu Lu ... Qizhen He
08 Jul 2009

Automatic image annotation method based on Gaussian mixture model
Na Chen
Journal of Computer Applications | VOL. 30
14 Dec 2010

A survey and analysis on automatic image annotation
Qimin Cheng ... Sen Li
Pattern Recognition | VOL. 79
13 Feb 2018
