Abstract

To recognise the context of an image and describe it in a natural language such as English, the fundamental task of image caption generation combines computer vision and natural language processing techniques. This Python project implements caption generation with two components: a Convolutional Neural Network (CNN), which extracts features that describe the image, and a Long Short-Term Memory (LSTM) network, a type of Recurrent Neural Network (RNN), which composes meaningful sentences from those features. The ability to automatically describe an image's content has a variety of uses, including helping visually impaired people better understand images and providing more precise, condensed image information for social media.
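The CNN-then-LSTM pipeline the abstract describes can be sketched end to end in plain NumPy. This is a minimal, untrained toy, not the project's actual model: the vocabulary, dimensions, and random weights are all illustrative assumptions. The CNN part reduces the image to a feature vector, which conditions the LSTM's initial hidden state; the LSTM then emits one word at a time.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cnn_encode(image, kernels):
    """Toy CNN encoder: valid 2-D convolution, ReLU, global average pooling.
    Returns one feature per kernel, summarising the image as a vector."""
    H, W = image.shape
    feats = []
    for k in kernels:
        kh, kw = k.shape
        out = np.array([[np.sum(image[i:i + kh, j:j + kw] * k)
                         for j in range(W - kw + 1)]
                        for i in range(H - kh + 1)])
        feats.append(np.maximum(out, 0.0).mean())   # ReLU + average pool
    return np.array(feats)

def lstm_step(x, h, c, W, U, b):
    """One LSTM cell update; gates stacked as [input, forget, output, cell]."""
    n = h.size
    z = W @ x + U @ h + b
    i, f, o = sigmoid(z[:n]), sigmoid(z[n:2 * n]), sigmoid(z[2 * n:3 * n])
    g = np.tanh(z[3 * n:])
    c = f * c + i * g
    h = o * np.tanh(c)
    return h, c

# Hypothetical tiny vocabulary with start/end tokens (an assumption).
vocab = ["<start>", "a", "dog", "on", "grass", "<end>"]
V, D, N = len(vocab), 8, 16          # vocab size, embedding dim, hidden dim

E = rng.normal(size=(V, D))          # word embeddings
W = rng.normal(size=(4 * N, D)) * 0.1
U = rng.normal(size=(4 * N, N)) * 0.1
b = np.zeros(4 * N)
proj = rng.normal(size=(N, 4))       # image features -> initial hidden state
out = rng.normal(size=(V, N))        # hidden state -> vocabulary logits

image = rng.normal(size=(16, 16))
kernels = [rng.normal(size=(3, 3)) for _ in range(4)]

feats = cnn_encode(image, kernels)   # CNN features describing the image
h = np.tanh(proj @ feats)            # condition the LSTM on the image
c = np.zeros(N)

# Greedy decoding: feed the previous word back in until <end> or max length.
caption, word = [], "<start>"
for _ in range(10):
    h, c = lstm_step(E[vocab.index(word)], h, c, W, U, b)
    word = vocab[int(np.argmax(out @ h))]
    if word == "<end>":
        break
    caption.append(word)

print("features:", feats.shape, "caption:", caption)
```

With random weights the emitted words are arbitrary; in the real project the CNN is typically a pretrained network and the LSTM is trained on image-caption pairs so that decoding produces a fluent description.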
