Abstract

The objective of description models is to generate captions that articulate the content of an image. Despite recent advances in machine learning and computer vision, generating discriminative captions remains a challenging problem. Traditional approaches imitate frequent language patterns without considering the semantic alignment of words. In this work, an image captioning framework is proposed that generates topic-sensitive descriptions. The model captures the semantic relations and the polysemous nature of the words that describe an image and, as a result, generates superior descriptions for target images. The efficacy of the proposed model is demonstrated by evaluation on standard captioning benchmark datasets, on which it shows promising performance compared to description models proposed in the recent literature.
