Abstract

People with visual or hearing impairments face difficulties in daily life due to inaccessible infrastructure and social challenges. To improve their quality of life, we report a portable, user-friendly smartphone-based platform that generates captions and text descriptions, with an optional narrator, from images captured by a smartphone camera. Image captioning is the task of generating a natural-language sentence that describes the visual content of an image; it has attracted increasing attention in computer vision and natural language processing because of its potential applications. Generating captions with proper linguistic properties is challenging because it requires combining advanced image understanding algorithms with natural language processing methods. In this study, we propose using a Long Short-Term Memory (LSTM) model to generate captions after the images are processed with the VGG16 deep learning architecture. The visual attributes of the images, which convey richer content, are extracted with VGG16 and then fed into the LSTM model for caption generation. The system is integrated with our custom-designed Android application, Eye of Horus, which transfers images from the smartphone to a remote server via a cloud system and displays the captions once the images have been processed with the proposed captioning approach. The results show that the integrated platform has great potential for image captioning for visually and hearing impaired people, with advantages such as portability, simple operation, and rapid response.
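
The caption generation step described above can be illustrated with a minimal sketch. The abstract does not state how words are chosen at each step, so greedy decoding (always taking the highest-scoring next word) is an assumption here, and every name in the block (`greedy_caption`, `toy_scores`, the `<start>` and `<end>` tokens) is hypothetical. The toy scoring function is a stand-in for a trained LSTM conditioned on VGG16 features, not the authors' model.

```python
from typing import Callable, Dict, List, Sequence

START, END = "<start>", "<end>"  # hypothetical sentence boundary tokens

# A scorer maps (image features, words so far) to a score per candidate word.
Scorer = Callable[[Sequence[float], List[str]], Dict[str, float]]


def greedy_caption(features: Sequence[float], scorer: Scorer, max_len: int = 20) -> str:
    """Build a caption one word at a time, always taking the best-scoring word."""
    words = [START]
    for _ in range(max_len):
        scores = scorer(features, words)
        best = max(scores, key=scores.get)
        if best == END:
            break
        words.append(best)
    return " ".join(words[1:])  # drop the start token


def toy_scores(features: Sequence[float], words: List[str]) -> Dict[str, float]:
    """Stand-in for the LSTM: a fixed lookup keyed on the previous word only.

    A real model would also use `features` (for example a VGG16 feature
    vector) and the full word history; this toy ignores both.
    """
    table = {
        START: {"a": 0.9, "dog": 0.1},
        "a": {"dog": 0.8, END: 0.2},
        "dog": {"runs": 0.7, END: 0.3},
        "runs": {END: 1.0},
    }
    return table[words[-1]]


if __name__ == "__main__":
    print(greedy_caption([0.0], toy_scores))
```

In a full pipeline, `features` would be the vector produced by the image encoder and `scorer` would wrap the trained language model; the decoding loop itself would not need to change.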
