Abstract
Image captioning is the task of generating a sentence that describes an image by analyzing its content. The objective is to generate such captions automatically, using deep learning algorithms, in order to gain a deeper understanding of the image. In this paper, we propose an image sentence generation model based on a deep neural network, the RCNN-LSTM model. The proposed model takes an image as input and generates a sentence as output, using natural language processing to describe the contents of the image. We developed the model by systematically analyzing deep neural networks and image sentence generation methodologies. The scheme uses image datasets and their sentence descriptions to train and test the model and to balance language and visual data. The model uses a Recurrent Convolutional Neural Network (RCNN), a combination of a Recurrent Neural Network (RNN) and a Convolutional Neural Network (CNN). The RNN processes the language part, and the CNN processes the image part to obtain feature vectors. Additionally, Long Short-Term Memory (LSTM) is used for textual sentence generation. In this model, the RCNN works as an encoder that extracts image features using Keras VGG16, and the LSTM works as a decoder that produces textual sentences describing the images. We used the Flickr8k, Flickr30k, and MS COCO datasets to train the model. The image sentence generation model achieved very good accuracy in generating captions.
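As an illustration of the decoder stage described above, the following is a minimal sketch of how a caption could be produced one word at a time from an encoded image. It is not the implementation used in this paper. The greedy word selection, the marker tokens `startseq` and `endseq`, the function names, and the lookup table standing in for a trained LSTM are all assumptions made for the example.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical marker tokens that delimit a caption (assumed, not from the paper).
START, END = "startseq", "endseq"

# A scorer maps (image features, words so far) to a score for each candidate next word.
Scorer = Callable[[Sequence[float], List[str]], Dict[str, float]]


def greedy_caption(features: Sequence[float], next_word_scores: Scorer,
                   max_len: int = 20) -> str:
    """Build a caption by repeatedly picking the highest-scoring next word."""
    words = [START]
    for _ in range(max_len):
        scores = next_word_scores(features, words)
        best = max(scores, key=scores.get)  # greedy choice (an assumption)
        if best == END:
            break
        words.append(best)
    return " ".join(words[1:])  # drop the start marker


def toy_scores(features: Sequence[float], words: List[str]) -> Dict[str, float]:
    """Stand-in for a trained decoder: a fixed table keyed by the previous word.

    A real decoder would also condition on the image features; this toy ignores them.
    """
    table = {
        "startseq": {"a": 0.9, "dog": 0.1},
        "a": {"dog": 0.8, "endseq": 0.2},
        "dog": {"runs": 0.7, "endseq": 0.3},
        "runs": {"endseq": 1.0},
    }
    return table[words[-1]]


if __name__ == "__main__":
    # The feature vector here is a placeholder for the encoder output.
    print(greedy_caption([0.1, 0.2], toy_scores))
```

In a full system, `toy_scores` would be replaced by a call to the trained decoder, and `features` by the feature vector that the encoder extracts from the image.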