End-to-End Speech Recognition of Tamil Language

A Nayeemulla Khan, A Shahina, Mohamed Hashim Changrampadi, M Badri Narayanan

doi:10.32604/iasc.2022.022021


Open Access


Journal: Intelligent Automation & Soft Computing	Publication Date: Jan 1, 2022
Citations: 12	License type: cc-by

Abstract

Research in speech recognition is progressing, with numerous state-of-the-art results in recent times. However, relatively little research is being carried out in Automatic Speech Recognition (ASR) for low-resource languages. We present a method to develop a speech recognition model with minimal resources using the Mozilla DeepSpeech architecture. We used freely available online computational resources for training, enabling similar approaches to be carried out for research on low-resource languages in financially constrained environments. We also present novel ways to build an efficient language model from publicly available web resources to improve ASR accuracy. The proposed ASR model achieves a best Word Error Rate (WER) of 24.7%, compared to 55% WER for Google speech-to-text. We also demonstrate semi-supervised development of a speech corpus using our trained ASR model, indicating a cost-effective approach to building a large-vocabulary corpus for a low-resource language. The trained Tamil ASR model and the training sets are released in the public domain and are available on GitHub.
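
For readers unfamiliar with the metric quoted above, WER is conventionally the word-level edit distance (substitutions + deletions + insertions) between the reference transcript and the recognizer output, divided by the number of reference words. The following is a minimal sketch of that standard definition, not the evaluation script used in the paper; the function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: words are separated by whitespace; no text normalization
    (punctuation, casing, numerals) is applied.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)
```

Under this definition, a hypothesis that substitutes one word and drops another from a four-word reference scores 2/4 = 0.5. Because insertions are counted, WER can exceed 100% when the hypothesis is much longer than the reference.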

Highlights

The advancement in Automatic Speech Recognition (ASR) over the past couple of years is commendable, surpassing even human perception
The DeepSpeech architecture needs Graphics Processing Unit (GPU) resources to complete training in minimal time, so a GPU is selected as the hardware accelerator in Google Colaboratory (GC)
Although GC limits usage time on a GPU, checkpoints are saved at regular intervals, and training resumes from them once the time limit is lifted
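
The checkpoint-and-resume pattern described in the last highlight can be sketched generically as follows. This is not DeepSpeech's checkpoint mechanism or the authors' training script: the function `train_with_checkpoints`, the JSON checkpoint file, and the `budget` parameter (standing in for a session's usage-time limit) are all hypothetical, and the "training step" is a placeholder counter.

```python
import json
import os
import tempfile


def train_with_checkpoints(total_steps, checkpoint_path, save_every, budget):
    """Run at most `budget` steps in one session, resuming from a checkpoint.

    Returns the last step completed in this session. Only steps that were
    written to the checkpoint survive into the next session.
    """
    step = 0
    if os.path.exists(checkpoint_path):
        # Resume from the last saved state instead of starting over.
        with open(checkpoint_path) as f:
            step = json.load(f)["step"]

    steps_run = 0
    while step < total_steps and steps_run < budget:
        step += 1       # placeholder for one real training step
        steps_run += 1
        if step % save_every == 0 or step == total_steps:
            with open(checkpoint_path, "w") as f:
                json.dump({"step": step}, f)
    return step


def demo():
    """Simulate three time-limited sessions sharing one checkpoint file."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "checkpoint.json")
        return [
            train_with_checkpoints(10, path, save_every=3, budget=4)
            for _ in range(3)
        ]
```

In this sketch the first session stops at step 4 but only step 3 was saved, so the second session repeats step 4. This is the usual trade-off: a shorter checkpoint interval wastes less work when a session is cut off, at the cost of more frequent writes.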

Summary

Introduction

The advancement in Automatic Speech Recognition (ASR) over the past couple of years is commendable, surpassing even human perception. We investigate the use of open-source speech recognition toolkits to build a speech recognition model for the Tamil language. The resulting pre-trained model will provide out-of-the-box support for transfer learning for tasks such as keyword spotting and isolated word recognition. We present a novel approach to building a pre-trained model with low resources, which can substantially assist in developing a massive speech corpus using semi-supervised learning. To our knowledge, this is the first attempt to use the Common Voice dataset and release a pre-trained ASR model for the Tamil language.

Related Works

Is Tamil a Low Resource Language?

Structure of Tamil Language and Its Challenges

Tamil ASR System Architecture

Speech Corpus

ASR Architecture

Language Model

Model Training and Results

Training Setup

Transfer Learning for Isolated Tamil Digit Recognition

Semi-Supervised Development of Speech Corpus

Conclusion


