Automatic Audio and Image Caption Generation with Deep Learning

K Lavanya, B Jayamala, C Jeyasri, A Sakthivel

doi:10.34293/sijash.v11is3-july.7916


Open Access

https://doi.org/10.34293/sijash.v11is3-july.7916


Abstract

This paper presents a novel approach to image caption generation designed specifically for visually impaired individuals. The proposed system uses computer vision algorithms to analyze images and generate descriptive textual captions. It also integrates text-to-speech conversion, which automatically transforms these captions into spoken audio and so makes visual content accessible to individuals with visual impairments. The goal of the project is to generate a descriptive caption for a given photograph or image. We achieve this by employing Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) models, both of which are deep learning techniques. Using computer vision, the system identifies the content of the image and generates a relevant caption, which is then converted into audio using Natural Language Processing (NLP).
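The abstract does not specify how the caption is decoded or which speech engine is used, so the following is only a minimal sketch of the caption-then-speak flow under stated assumptions. It assumes greedy word-by-word decoding, which is one common choice for CNN-LSTM captioners but is not confirmed by the source. All names (`greedy_caption`, `toy_scorer`, `describe`, the `<start>`/`<end>` tokens) are hypothetical, and `toy_scorer` is a hand-written stand-in for a trained CNN encoder and LSTM decoder.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical sentinel tokens marking the start and end of a caption.
START, END = "<start>", "<end>"

# Hypothetical scorer type: given image features and the words generated so far,
# return a score for each candidate next word. In a real system this role would be
# played by a trained CNN encoder plus LSTM decoder; here it is only a stand-in.
Scorer = Callable[[Sequence[float], List[str]], Dict[str, float]]


def greedy_caption(features: Sequence[float], scorer: Scorer, max_len: int = 20) -> str:
    """Build a caption one word at a time, always taking the highest-scoring word."""
    words = [START]
    for _ in range(max_len):
        scores = scorer(features, words)
        best = max(scores, key=scores.get)
        if best == END:
            break
        words.append(best)
    return " ".join(words[1:])  # drop the start token


def toy_scorer(features: Sequence[float], words: List[str]) -> Dict[str, float]:
    """Toy stand-in for a trained decoder: ignores the features, follows a fixed chain."""
    chain = {START: "a", "a": "dog", "dog": "runs", "runs": END}
    nxt = chain[words[-1]]
    return {w: (1.0 if w == nxt else 0.0) for w in ["a", "dog", "runs", END]}


def describe(features: Sequence[float], scorer: Scorer, speak: Callable[[str], None]) -> str:
    """Caption the image features, then hand the text to a text-to-speech callback."""
    caption = greedy_caption(features, scorer)
    speak(caption)  # assumption: any function that accepts the caption text
    return caption


if __name__ == "__main__":
    spoken: List[str] = []
    # spoken.append stands in for a real speech engine in this sketch.
    print(describe([0.1, 0.9], toy_scorer, spoken.append))
```

Passing the speech step in as a callback keeps the sketch independent of any particular text-to-speech library; a real engine could be substituted for `spoken.append` without changing the decoding loop.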


Journal: Shanlax International Journal of Arts, Science and Humanities
Publication Date: Jul 8, 2024
License type: CC BY-SA 4.0

Similar Papers

An accurate generation of image captions for blind people using extended convolutional atom neural network.
Tejal Tiwary ... Rajendra Prasad Mahapatra
Multimedia Tools and Applications | VOL. 82
15 Jul 2022

An Infrared Array Sensor-Based Approach for Activity Detection, Combining Low-Cost Technology with Advanced Deep Learning Techniques.
Krishnan Arumugasamy Muthukumar ... Tomoaki Ohtsuki
Sensors (Basel, Switzerland) | VOL. 22
20 May 2022

Image Caption Generator by using CNN and LSTM
S Pasupathy
International Journal For Multidisciplinary Research | VOL. 5
23 Apr 2023

A novel framework for automatic caption and audio generation
Chaitanya Kulkarni ... Shruthi S
Materials Today: Proceedings | VOL. 65
01 Jan 2021
