Abstract
Understanding content in an unfamiliar language can be challenging for people living in India's diverse linguistic landscape. We propose a system that addresses this issue through machine translation combined with speech recognition and speech synthesis. It converts online video resources into Indian languages by integrating open-source technologies such as speech-to-text (STT) and text-to-speech (TTS) systems, together with the FFmpeg library to separate and recombine audio and video. We used the Whisper model, which accepts audio input in up to 60 different languages and transcribes it into text, split into sentence-level segments with timestamps. The sentence-based transcription generated by Whisper is then translated into the desired language using Google Cloud translate_v2. Next, each timestamped segment is individually converted into audio using the Google Cloud Text-to-Speech service, ensuring that the audio fits within the duration of its respective timestamp. The individual audio segments are then combined to generate the final audio track in the desired language. Finally, this audio track is attached to the original video, preserving video-audio synchronization. The accuracy of the translation was verified by comparing the naturalness of the audio with general spoken language standards. This application benefits visually impaired individuals and those who cannot read text, providing them with a means to acquire knowledge in their native languages.
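The timing step described above (fitting each synthesized clip inside its timestamp, then joining the clips and attaching them to the video) can be illustrated with a minimal sketch. This is not the authors' implementation. The `Segment` class and the functions `speed_factor`, `build_timeline`, and `mux_command` are hypothetical names introduced here for illustration. The sketch assumes that segments do not overlap and that the duration of each synthesized clip is already known. It does not call Whisper or any Google Cloud service; it only computes the timing plan and builds an FFmpeg command line.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Segment:
    """One translated sentence with its slot in the source video (hypothetical type)."""
    start: float         # slot start, in seconds
    end: float           # slot end, in seconds
    text: str            # translated sentence
    tts_duration: float  # length of the synthesized clip, in seconds (assumed known)


def speed_factor(seg: Segment) -> float:
    """Playback-rate multiplier (>= 1.0) needed for the clip to fit its slot."""
    slot = seg.end - seg.start
    if slot <= 0:
        raise ValueError("segment must have a positive duration")
    # Clips that already fit are left at normal speed.
    return max(1.0, seg.tts_duration / slot)


def build_timeline(segments: List[Segment]) -> List[Tuple[str, float]]:
    """Lay out silences and clips so each clip starts at its original timestamp.

    Returns ("silence" | "speech", duration_in_seconds) entries in playback order.
    Assumes non-overlapping segments.
    """
    timeline: List[Tuple[str, float]] = []
    cursor = 0.0
    for seg in sorted(segments, key=lambda s: s.start):
        if seg.start > cursor:
            # Pad with silence up to the start of this segment.
            timeline.append(("silence", seg.start - cursor))
            cursor = seg.start
        played = seg.tts_duration / speed_factor(seg)
        timeline.append(("speech", played))
        cursor += played
    return timeline


def mux_command(video_path: str, audio_path: str, output_path: str) -> List[str]:
    """FFmpeg arguments that keep the original video stream and swap in the new audio."""
    return [
        "ffmpeg", "-y",
        "-i", video_path,   # input 0: original video
        "-i", audio_path,   # input 1: dubbed audio track
        "-map", "0:v:0",    # take the video stream from input 0
        "-map", "1:a:0",    # take the audio stream from input 1
        "-c:v", "copy",     # copy the video stream without re-encoding
        "-c:a", "aac",      # encode the new audio as AAC
        output_path,
    ]


segments = [
    Segment(start=0.5, end=2.5, text="first sentence", tts_duration=1.0),
    Segment(start=3.0, end=4.0, text="second sentence", tts_duration=2.0),
]
plan = build_timeline(segments)
command = mux_command("input.mp4", "dubbed.wav", "output.mp4")
```

In this example the second clip is twice as long as its slot, so it would need to be played at double speed (or re-synthesized at a faster speaking rate) to stay in sync. The file names passed to `mux_command` are placeholders, and the returned list could be handed to `subprocess.run` once FFmpeg is installed.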
More From: International Journal of Innovative Science and Research Technology (IJISRT)