Abstract

Image captioning, also known as picture captioning, has become a widely used technology in applications that automatically generate captions for photographs. This is accomplished with deep neural networks, which identify the objects in an image along with their attributes and relationships. The purpose of this research is to detect the objects in a photograph, determine their relationships, and generate descriptive captions. The proposed system is implemented in Python on the Flickr8k dataset. The input images are pre-processed, and image features are then extracted using a convolutional neural network (CNN). A long short-term memory (LSTM) network is used to translate the features and objects extracted by the CNN into a natural English sentence. The system is tested on different types of images, and the results are presented together with the generated captions, demonstrating the accuracy of the system. The presented method has potential for applications where image captioning is essential.
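To illustrate the kind of CNN encoder plus LSTM decoder pipeline described above, the sketch below builds a minimal captioning model in Keras. It is only an illustrative sketch, not the authors' implementation: the choice of InceptionV3 as the feature extractor, the layer sizes, and the vocabulary size and maximum caption length constants are assumptions introduced here for demonstration.

```python
# Minimal sketch of a CNN + LSTM image-captioning model (illustrative only).
# VOCAB_SIZE, MAX_LENGTH, the 256-unit layers, and InceptionV3 as the encoder
# are assumed values for demonstration, not figures reported in the paper.

import numpy as np
from tensorflow.keras.applications import InceptionV3
from tensorflow.keras.applications.inception_v3 import preprocess_input
from tensorflow.keras.preprocessing import image
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, Dense, Embedding, LSTM, Dropout, add

VOCAB_SIZE = 8000    # assumed caption vocabulary size
MAX_LENGTH = 34      # assumed maximum caption length in tokens
FEATURE_DIM = 2048   # size of the pooled InceptionV3 feature vector

# CNN encoder: a pre-trained InceptionV3 with its classification head removed,
# so the output of the global-average-pooling layer serves as the image feature.
base_cnn = InceptionV3(weights="imagenet")
encoder = Model(base_cnn.input, base_cnn.layers[-2].output)

def extract_features(img_path):
    """Pre-process one image and return its 2048-dimensional CNN feature vector."""
    img = image.load_img(img_path, target_size=(299, 299))
    x = preprocess_input(np.expand_dims(image.img_to_array(img), axis=0))
    return encoder.predict(x, verbose=0)  # shape: (1, 2048)

# Decoder: combine the image features with a partial caption and predict the next word.
img_input = Input(shape=(FEATURE_DIM,))
img_dense = Dense(256, activation="relu")(Dropout(0.5)(img_input))

seq_input = Input(shape=(MAX_LENGTH,))
seq_embed = Embedding(VOCAB_SIZE, 256, mask_zero=True)(seq_input)
seq_lstm = LSTM(256)(Dropout(0.5)(seq_embed))

merged = add([img_dense, seq_lstm])
output = Dense(VOCAB_SIZE, activation="softmax")(Dense(256, activation="relu")(merged))

caption_model = Model([img_input, seq_input], output)
caption_model.compile(loss="categorical_crossentropy", optimizer="adam")
caption_model.summary()
```

At inference time, a model of this form would typically be run word by word: starting from a start token, the predicted word is appended to the partial caption and fed back in until an end token or the maximum length is reached.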
