Abstract
Artificial intelligence research has long pursued the task of automatically describing the contents of an image. This paper presents an Automatic Caption Generator built from a CNN encoder and an RNN-LSTM decoder, drawing on recent advances in image processing and machine translation. The model was trained on the Flickr8k dataset, and BLEU scores were used to evaluate the quality of the generated captions: higher scores indicate better captions. Potential applications of this approach include virtual assistants, image indexing, social media, assistive technology for the visually impaired, app recommendations, and related areas. Key Words: Automatic Caption Generator, CNN, RNN-LSTM, Computer vision, Machine translation, Flickr8k dataset, BLEU scores, Performance evaluation, Virtual assistants, Image indexing, Social media, Assistive technology, App recommendations.
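The abstract evaluates generated captions with BLEU, which compares the n-grams of a candidate caption against one or more human reference captions. As a minimal sketch (not the paper's implementation; the caption strings are illustrative), sentence-level BLEU with clipped n-gram precisions and a brevity penalty can be computed as follows:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(references, hypothesis, max_n=4):
    """Sentence-level BLEU with uniform weights over 1..max_n grams."""
    hyp = hypothesis.lower().split()
    refs = [r.lower().split() for r in references]
    log_prec_sum = 0.0
    for n in range(1, max_n + 1):
        hyp_counts = Counter(ngrams(hyp, n))
        if not hyp_counts:
            return 0.0
        # Clip each hypothesis n-gram count by its maximum count in any reference.
        max_ref = Counter()
        for ref in refs:
            for gram, count in Counter(ngrams(ref, n)).items():
                max_ref[gram] = max(max_ref[gram], count)
        clipped = sum(min(c, max_ref[g]) for g, c in hyp_counts.items())
        if clipped == 0:
            return 0.0  # unsmoothed BLEU is zero if any precision is zero
        log_prec_sum += math.log(clipped / sum(hyp_counts.values()))
    # Brevity penalty against the reference closest in length.
    ref_len = min((abs(len(r) - len(hyp)), len(r)) for r in refs)[1]
    bp = 1.0 if len(hyp) >= ref_len else math.exp(1 - ref_len / len(hyp))
    return bp * math.exp(log_prec_sum / max_n)

refs = ["a dog runs across the grass", "a brown dog running on grass"]
print(bleu(refs, "a dog runs across the grass"))  # exact match scores 1.0
```

In practice, library implementations such as NLTK's `sentence_bleu` with a smoothing function are preferred, since unsmoothed sentence-level BLEU collapses to zero whenever a higher-order n-gram is missing.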
Published in: International Journal of Scientific Research in Engineering and Management