A probabilistic topic model (PTM) combined with the bag-of-visual-words model is a common approach to bridging the so-called "semantic gap" in remote-sensing image classification. Owing to the inherent shortcomings of PTMs, such as high computational cost and a failure to consider the spatial arrangement of objects, we introduce a document-to-vector (Doc2Vec) model from natural language processing, instead of a PTM, to capture the high-level semantic information of the images. The model represents words and documents as dense, low-dimensional vectors and trains them with a simplified, shallow neural network, offering a new perspective for mining the semantic information of remote-sensing images. We also improve the quality of the low-level features by using feature-specific sampling methods. Two high-spatial-resolution remote-sensing image datasets, UC Merced and RSSCN7, are employed in a scene classification experiment to evaluate the performance of the Doc2Vec model. The experimental results show that the Doc2Vec model mines the semantic information of the images effectively and compares favorably with state-of-the-art methods.
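To make the idea concrete, the following is a minimal, illustrative pure-Python sketch of the distributed-memory (PV-DM) variant of Doc2Vec, in which a document vector and the surrounding word vectors jointly predict the next word through a shallow softmax network. This is not the authors' implementation: the tiny "visual word" documents, the one-word context window, and all hyperparameters are invented for the example.

```python
import math
import random

random.seed(0)

def train_pv_dm(docs, dim=8, lr=0.05, epochs=200):
    """Toy PV-DM trainer: document vector + context word vectors predict
    the target word via a shallow softmax layer (illustrative only)."""
    vocab = sorted({w for d in docs for w in d})
    widx = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)
    # Dense, low-dimensional embeddings for words and documents.
    wvec = [[random.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(V)]
    dvec = [[random.uniform(-0.5, 0.5) for _ in range(dim)] for _ in docs]
    # Output weights of the shallow network (no hidden nonlinearity).
    out = [[random.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(V)]
    for _ in range(epochs):
        for di, doc in enumerate(docs):
            for t in range(1, len(doc)):
                ctx = [widx[doc[t - 1]]]       # one-word context window
                target = widx[doc[t]]
                # Hidden state: average of document and context vectors.
                h = [(dvec[di][k] + sum(wvec[c][k] for c in ctx)) / (1 + len(ctx))
                     for k in range(dim)]
                # Softmax over the vocabulary.
                scores = [sum(out[v][k] * h[k] for k in range(dim)) for v in range(V)]
                m = max(scores)
                exps = [math.exp(s - m) for s in scores]
                Z = sum(exps)
                probs = [e / Z for e in exps]
                # Cross-entropy gradient w.r.t. h and the output weights.
                gh = [0.0] * dim
                for v in range(V):
                    err = probs[v] - (1.0 if v == target else 0.0)
                    for k in range(dim):
                        gh[k] += err * out[v][k]
                        out[v][k] -= lr * err * h[k]
                # Back-propagate into the document and word embeddings.
                for k in range(dim):
                    g = gh[k] / (1 + len(ctx))
                    dvec[di][k] -= lr * g
                    for c in ctx:
                        wvec[c][k] -= lr * g
    return dvec, wvec, vocab

# Tiny "visual word" documents standing in for two image scenes.
docs = [["roof", "road", "roof", "car"],
        ["tree", "grass", "tree", "water"]]
dvecs, _, _ = train_pv_dm(docs)
```

After training, each document vector in `dvecs` is a dense, low-dimensional representation of its scene, analogous to how the paper's Doc2Vec model summarizes an image's visual-word sequence; a practical system would instead use an optimized library implementation such as gensim's `Doc2Vec`.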