Abstract

Document images contain textual and non-textual characters, which are usually converted into voice or audio format using text-to-speech (TTS) systems. Blind individuals stand to benefit from this TTS technology. According to the World Health Organization, 1.99% of India's population is blind, so aiding the blind is necessary. This research proposes a machine-learning-based text-to-speech converter. First, the proposed text-to-speech conversion algorithm is embedded on a Raspberry Pi. Second, a camera captures images as input for the TTS unit. Third, the TTS unit on the Raspberry Pi processes the captured image, and its output is amplified using an audio amplifier. Fourth, the amplified signal is sent to the speaker. The resulting reader lets the user hear the text they have supplied. The process entails text extraction from the image followed by text-to-speech conversion: with a camera module and a Raspberry Pi, optical character recognition (OCR) extracts the text, which is then converted to speech. The setup consists of a webcam interfaced with the Raspberry Pi. The audio output of the Raspberry Pi can be heard through speakers or headphones. The conversion takes a few seconds. This device can make it easier for people who are visually challenged to read text from images. Experimental results show a significant improvement compared with state-of-the-art text-to-speech conversion algorithms. The paper concludes with future research directions.
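The capture, OCR, and speech stages described above can be sketched in Python as follows. This is only an illustrative outline, not the authors' implementation: the abstract does not name the OCR engine or the speech library, so OpenCV (`cv2`), Tesseract via `pytesseract`, and `pyttsx3` are assumptions, and the function names `capture_image`, `extract_text`, `speak`, `clean_ocr_text`, and `read_aloud` are hypothetical. Amplification is assumed to happen in hardware after the audio output, so it does not appear in the code.

```python
def clean_ocr_text(raw: str) -> str:
    """Collapse runs of whitespace and drop empty lines from raw OCR output."""
    lines = [" ".join(line.split()) for line in raw.splitlines()]
    return " ".join(line for line in lines if line)


def capture_image(camera_index: int = 0):
    """Grab a single frame from the webcam (assumes OpenCV is installed)."""
    import cv2  # assumption: the source does not name a capture library

    cam = cv2.VideoCapture(camera_index)
    try:
        ok, frame = cam.read()
    finally:
        cam.release()
    if not ok:
        raise RuntimeError("camera returned no frame")
    return frame


def extract_text(image) -> str:
    """Run OCR on a frame (assumes Tesseract with the pytesseract wrapper)."""
    import cv2
    import pytesseract  # assumption: the source does not name an OCR engine

    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)  # grayscale input for OCR
    return pytesseract.image_to_string(gray)


def speak(text: str) -> None:
    """Send text to the audio output (assumes the pyttsx3 offline engine)."""
    import pyttsx3  # assumption: the source does not name a TTS library

    engine = pyttsx3.init()
    engine.say(text)
    engine.runAndWait()  # blocks until the utterance has been played


def read_aloud(capture=capture_image, ocr=extract_text, tts=speak) -> str:
    """Capture an image, extract its text, speak it, and return the text."""
    image = capture()
    text = clean_ocr_text(ocr(image))
    if text:  # skip the speech stage when no text was recognized
        tts(text)
    return text
```

Passing the three stages as parameters keeps the sketch testable without a camera or speaker attached; on the device, calling `read_aloud()` with no arguments would use the assumed hardware-backed defaults.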

