Abstract

Image-text matching based on the image caption method has made great progress. However, news text contains many named entities, and existing approaches cannot directly generate named entities in a news image caption, which leaves a semantic gap between the text and the caption. Moreover, existing methods do not analyze the indirect relations between named entities, so they easily produce relation errors when generating news image captions. To generate news image captions with named entities by analyzing the indirect relations between those entities, we propose a novel model. In detail, we propose the TopNews dataset with related news, which aims to construct the relations between named entities as widely as possible. We then build a news knowledge graph by extracting named entities from the TopNews dataset. Furthermore, we propose the News Knowledge Driven Graph Neural Network (NKD-GNN), which we use to analyze the full set of entity relations in the news knowledge graph. In this way, we generate news image captions with named entities. Extensive experiments on the TopNews dataset and a common dataset demonstrate that our approach is effective in detecting the consistency of news images and text.

Highlights

  • Detecting the consistency of image and text by using the image caption method has attracted increasing attention in recent years [1]–[5]. However, due to the semantic gap between news text and news images, calculating the consistency between them is still a challenging problem. Recently, some works have tried to overcome the semantic gap between news images and text

  • We conclude that detecting the consistency of news images and text requires analyzing the indirect relations between named entities in the news text

  • The news image caption generated in this way has background knowledge similar to that of the news text


Summary

Introduction

Detecting the consistency of image and text by using the image caption method has attracted increasing attention in recent years [1]–[5]. However, due to the semantic gap between news text and news images, calculating the consistency between them is still a challenging problem. Recently, some works have tried to overcome this semantic gap. These methods first generate a template caption with placeholders for named entities and connect the named entities of the news text into a graph. The best candidate for each placeholder is then chosen by analyzing the direct relations between named entities, such as the co-occurrence rate of adjacent entities in the graph. These approaches have significantly improved image-text matching. However, they ignore the indirect relations between named entities, which leads to relation errors when choosing entity candidates.
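
The difference between direct and indirect relations can be illustrated with a small co-occurrence graph. The sketch below is not NKD-GNN and is not taken from the paper; it is a minimal toy example, assuming that entities are linked when they appear in the same sentence and that a two-hop path is a reasonable stand-in for an "indirect relation". The entity names, the function names, and the two-hop scoring rule are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_cooccurrence_graph(sentences):
    """Count how often each pair of entities appears in the same sentence.

    `sentences` is a list of entity lists, one list per sentence.
    Returns a symmetric weighted adjacency mapping.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for entities in sentences:
        for a, b in combinations(sorted(set(entities)), 2):
            graph[a][b] += 1
            graph[b][a] += 1
    return graph


def direct_score(graph, context, candidate):
    """Direct relation: co-occurrence count on the edge context-candidate."""
    return graph.get(context, {}).get(candidate, 0)


def two_hop_score(graph, context, candidate):
    """Indirect relation (assumed scoring rule, for illustration only):
    sum over intermediate entities of the weaker edge on each two-hop path.
    """
    total = 0
    for middle, weight in graph.get(context, {}).items():
        if middle == candidate:
            continue  # that is a direct edge, not an indirect one
        total += min(weight, graph[middle].get(candidate, 0))
    return total


def pick_candidate(graph, context, candidates, score):
    """Choose the candidate with the highest score for a caption placeholder."""
    return max(candidates, key=lambda c: score(graph, context, c))


# Hypothetical entities extracted from a few news sentences.
sentences = [
    ["Alice", "Acme Corp"],
    ["Acme Corp", "Berlin"],
    ["Alice", "Acme Corp"],
    ["Bob", "Paris"],
]
graph = build_cooccurrence_graph(sentences)

# "Alice" never co-occurs with either location, so direct scores tie at 0.
print(direct_score(graph, "Alice", "Berlin"), direct_score(graph, "Alice", "Paris"))
# The path Alice - Acme Corp - Berlin separates the two candidates.
print(pick_candidate(graph, "Alice", ["Paris", "Berlin"], two_hop_score))
```

In this toy data, a method that looks only at adjacent entities cannot separate the two location candidates, while following one intermediate entity does. A graph neural network generalizes this idea by propagating information over longer paths, but the scoring rule above is only a hand-written stand-in for that.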
