Abstract

This work presents a model that guides and supports blind people while they travel on highways, using only a smartphone application. The application first converts the scene in front of the user into text and then converts that text into voice output. Scene description relies on an image captioning method based on deep neural networks: given an image as input, the method produces an English sentence describing the contents of the image. The user first issues a voice command, and the camera or webcam then captures a quick snapshot. This image is fed as input to the image caption generator model, which generates a caption for it. Finally, the caption text is converted to speech, producing a spoken description of the image.
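The four stages described above (voice command, snapshot, caption, speech) can be sketched as a small pipeline. This is only an illustrative outline, not the authors' implementation: the class name `AssistivePipeline`, the trigger word `"describe"`, and all four stage functions are hypothetical placeholders. In a real application they would wrap a speech recognizer, the device camera, a trained captioning network, and a text-to-speech engine.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class AssistivePipeline:
    """Hypothetical wiring of the four stages; every stage is injected."""
    listen: Callable[[], str]        # returns the recognized voice command
    capture: Callable[[], bytes]     # returns one snapshot as raw image bytes
    caption: Callable[[bytes], str]  # image -> English sentence
    speak: Callable[[str], None]     # plays the sentence as audio
    trigger: str = "describe"        # assumed trigger word, not from the paper

    def run_once(self) -> Optional[str]:
        """Run one command-to-speech cycle; return the caption, or None."""
        command = self.listen().strip().lower()
        if self.trigger not in command:
            return None              # ignore unrelated commands
        image = self.capture()
        text = self.caption(image)
        self.speak(text)
        return text


# Stand-in stages so the sketch runs without a microphone, camera, or model.
spoken = []
pipeline = AssistivePipeline(
    listen=lambda: "Describe the scene",
    capture=lambda: b"fake-image-bytes",
    caption=lambda image: "a car parked beside the road",
    speak=spoken.append,
)
result = pipeline.run_once()
```

Passing each stage in as a function keeps the control flow testable on its own, so the captioning model or the speech engine can be swapped without touching the loop.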
