Abstract
Image description transforms an image into text that describes its content, but the characters in the image, together with their attributes and relationships, affect the one-to-one correspondence between the text description and the image. To address this problem, we design an encoder-decoder structure based on a self-aware target detector to extract distinct feature and role information, and a role integrity check module that supplements the description sentences with rich object information, improving the accuracy with which roles in the image are described. Experiments on the Flickr8k, Flickr30k, and MS COCO datasets show that the method improves the semantic consistency between generated descriptions and images and, compared with traditional deep learning methods, also performs well on semantic diversity.