Abstract

In this study, we implemented a Recurrent Neural Network based model for image caption generation in the Sinhala language. Following the literature, the model uses InceptionV3 for image feature extraction and a Long Short-Term Memory network as the language model. Several Sinhala versions of the Flickr8K and MS COCO datasets were constructed and used to train experimental models. The generated captions were evaluated using both automated and manual approaches. The model trained on the MS COCO dataset with Google-translated Sinhala captions achieved the highest BLEU score, 0.592, and the highest METEOR score, 0.281. Manual caption analysis showed that some generated captions convey the image content well to a reader despite having low BLEU and METEOR scores.
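
To make the automated evaluation concrete, the following is a minimal sketch of single-reference, unsmoothed sentence-level BLEU. It is an illustration only: the abstract does not state the n-gram order, smoothing, number of references, or tokenization used in the study, so the function name `sentence_bleu`, the `max_n` parameter, and the toy English tokens are all assumptions. The function operates on token lists, so it applies equally to tokenized Sinhala captions.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(reference, candidate, max_n=4):
    """Single-reference, unsmoothed BLEU with uniform n-gram weights.

    Hypothetical helper for illustration; not the study's evaluation code.
    """
    if not candidate:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(candidate, n)
        ref_counts = ngrams(reference, n)
        total = sum(cand_counts.values())
        # Clipped matches: each candidate n-gram is credited at most as
        # many times as it occurs in the reference.
        matched = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        if total == 0 or matched == 0:
            return 0.0  # without smoothing, one empty order zeroes the score
        log_precisions.append(math.log(matched / total))
    # Brevity penalty: penalize candidates shorter than the reference.
    if len(candidate) >= len(reference):
        bp = 1.0
    else:
        bp = math.exp(1 - len(reference) / len(candidate))
    # Geometric mean of the modified n-gram precisions, scaled by the penalty.
    return bp * math.exp(sum(log_precisions) / max_n)


reference = "a dog runs on the grass".split()
candidate = "a dog runs".split()
print(sentence_bleu(reference, reference))
print(sentence_bleu(reference, candidate, max_n=1))
```

Because the score depends only on n-gram overlap with the reference, a caption that describes the image correctly in different words scores low, which is consistent with the manual analysis described above.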
