Abstract

Image description transforms an image into a text description of its content, but the roles (characters) in the image, together with their attributes and relationships, affect the one-to-one correspondence between the text description and the image. To address this problem, we design an encoder-decoder structure based on a self-aware target detector to extract distinct feature and role information, and a role integrity check module that supplements the description sentences with rich object information, improving the accuracy with which the roles in the image are described. Experiments on the Flickr8k, Flickr30k, and MS COCO datasets show that the method improves the semantic consistency between the generated descriptions and the images and, compared with traditional deep learning methods, also performs well in terms of semantic diversity.
