Image Annotation Refinement Research Articles

Visual Question Answering (VQA) is an active research area at the intersection of computer vision and natural language processing, and its progress has enabled many high-level applications. This work describes a novel VQA model based on semantic concept network construction and deep walk. Extracting a semantic representation of the visual image is an effective way to bridge the semantic gap. Moreover, recent research has shown that co-occurrence patterns of concepts can enhance semantic representation. This work is motivated by the observation that semantic concepts have complex interrelations that resemble a network. We therefore construct a semantic concept network by leveraging Word Activation Forces (WAFs) and mine the co-occurrence patterns of semantic concepts using deep walk. The model then performs polynomial logistic regression on the extracted deep walk vector together with the visual image feature and the question feature. The proposed model effectively integrates the visual and semantic features of the image with the natural language question. Experimental results show that our algorithm outperforms competitive baselines on three benchmark image QA datasets. Furthermore, through experiments in image annotation refinement and semantic analysis on the pre-labeled LabelMe dataset, we verify the effectiveness of the constructed concept network for mining concept co-occurrence patterns, sensible concept clusters, and hierarchies.
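The walk-generation step that deep walk relies on can be illustrated with a minimal sketch. It is not the authors' implementation: the concept names and edge weights are invented stand-ins for WAF-derived link strengths, the helper names (`build_graph`, `random_walk`, `generate_walks`) are hypothetical, and sampling neighbors in proportion to edge weight is an assumption of this sketch (the original DeepWalk formulation samples neighbors uniformly).

```python
import random
from collections import defaultdict


def build_graph(edges):
    """Build an undirected weighted adjacency map from (u, v, weight) triples."""
    graph = defaultdict(dict)
    for u, v, w in edges:
        graph[u][v] = w
        graph[v][u] = w
    return graph


def random_walk(graph, start, length, rng):
    """Return one walk of at most `length` nodes, sampling neighbors by edge weight."""
    walk = [start]
    while len(walk) < length:
        neighbors = graph[walk[-1]]
        if not neighbors:
            break  # isolated node: the walk cannot continue
        nodes = list(neighbors)
        weights = [neighbors[n] for n in nodes]
        walk.append(rng.choices(nodes, weights=weights, k=1)[0])
    return walk


def generate_walks(graph, walks_per_node, length, seed=0):
    """Start `walks_per_node` walks from every node, visiting nodes in shuffled order."""
    rng = random.Random(seed)
    nodes = list(graph)
    walks = []
    for _ in range(walks_per_node):
        rng.shuffle(nodes)
        for node in nodes:
            walks.append(random_walk(graph, node, length, rng))
    return walks


# Hypothetical concept network; weights stand in for WAF-derived strengths.
edges = [("sky", "cloud", 0.9), ("sky", "beach", 0.4), ("beach", "sea", 0.8)]
graph = build_graph(edges)
walks = generate_walks(graph, walks_per_node=2, length=5)
```

In a DeepWalk-style pipeline, the resulting walks are treated as sentences and passed to a skip-gram model to obtain one embedding vector per concept.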

Images on the Web and on personal computers are now pervasive in daily life. Many Automatic Image Annotation (AIA) algorithms have been proposed to retrieve these images effectively. However, AIA still suffers from low accuracy because it cannot overcome the semantic gap between low-level features (`color', `texture' and `shape') and high-level semantic meanings (e.g., `sky', `beach'). As a result, AIA techniques annotate images with many noisy keywords. In this paper, we propose a novel approach that augments the classical model with a generic knowledge base, WordNet, and uses it to prune irrelevant keywords. To identify irrelevant keywords, we investigate various semantic similarity measures between keywords and fuse the outcomes of all these measures into a final decision using Dempster-Shafer evidence combination. Furthermore, we reformulate the removal of erroneous keywords from image annotations as a graph-partitioning problem, namely the weighted MAX-CUT problem. Because web images may have too many candidate keywords, we need a deterministic polynomial-time algorithm for the MAX-CUT problem. We show that finding the optimal solution for removing noisy keywords in the graph is an NP-Complete problem and propose a new methodology for Knowledge Based Image Annotation Refinement (KBIAR) using a deterministic polynomial-time algorithm, namely, a randomized approximation graph algorithm. Finally, we demonstrate the superiority of this algorithm over traditional ones, including the most recent work, on a benchmark dataset.
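To make the weighted MAX-CUT formulation concrete, here is a minimal sketch of the textbook randomized approximation, which places each vertex on a random side and keeps the best of several trials. It is not the KBIAR algorithm itself: the keywords, the edge weights (assumed here to encode semantic distance between candidate keywords), and the function names `cut_weight` and `random_max_cut` are all hypothetical.

```python
import random


def cut_weight(edges, side):
    """Sum the weights of edges whose endpoints fall on different sides."""
    return sum(w for u, v, w in edges if side[u] != side[v])


def random_max_cut(nodes, edges, trials=100, seed=0):
    """Try `trials` uniformly random two-way partitions and keep the heaviest cut."""
    rng = random.Random(seed)
    best_side, best_weight = None, -1.0
    for _ in range(trials):
        # Each node goes to side True or False with probability 1/2.
        side = {n: rng.random() < 0.5 for n in nodes}
        weight = cut_weight(edges, side)
        if weight > best_weight:
            best_side, best_weight = side, weight
    return best_side, best_weight


# Hypothetical candidate keywords; weights stand in for semantic distances.
nodes = ["sky", "beach", "sea", "car"]
edges = [
    ("sky", "beach", 0.2),
    ("beach", "sea", 0.1),
    ("sky", "sea", 0.3),
    ("sky", "car", 0.9),
    ("beach", "car", 0.8),
    ("sea", "car", 0.9),
]
best_side, best_weight = random_max_cut(nodes, edges)
```

With non-negative weights, a single uniformly random partition cuts half of the total edge weight in expectation, which is the basis of the classical 1/2-approximation guarantee; repeating trials only improves the cut that is kept.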

Articles published on Image Annotation Refinement

The Image Annotation Refinement in Embedding Feature Space based on Mutual Information

Semantic Concept Network and Deep Walk-based Visual Question Answering

Image annotation refinement via 2P-KNN based group sparse reconstruction

Semi-Supervised Learning Model Based Efficient Image Annotation

Knowledge Based Image Annotation Refinement
