Abstract
An automatic speech recognition (ASR) system is built for continuous Kannada speech recognition. The acoustic and language models are created with the Kaldi toolkit. The speech database is collected from native male and female Kannada speakers. 80% of the collected speech data is used for training the acoustic models, and the remaining 20% is used for testing the system. System performance is reported in terms of Word Error Rate (WER). Features are extracted using Wavelet Packet Decomposition together with a Mel filter bank. Under uncontrolled conditions, the proposed features perform slightly better than conventional features such as MFCC and PLP in terms of Word Recognition Accuracy (WRA) and WER. On the speech corpus collected in the Kannada language, the proposed features show an improvement in WRA of 1.79% over the baseline features.
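Since the results are reported as WER and WRA, a minimal sketch of how these metrics are commonly computed may help. It assumes the standard word-level edit-distance definition of WER and the common convention WRA = 100 - WER; the source does not state which exact definitions were used, and the function names below are illustrative rather than part of Kaldi.

```python
def edit_distance(ref_words, hyp_words):
    """Word-level Levenshtein distance (substitutions + deletions + insertions)."""
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, 1):
        curr = [i]
        for j, hyp_word in enumerate(hyp_words, 1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def word_error_rate(reference, hypothesis):
    """WER in percent: edit distance divided by the number of reference words."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference transcript must contain at least one word")
    return 100.0 * edit_distance(ref_words, hyp_words) / len(ref_words)


def word_recognition_accuracy(reference, hypothesis):
    """WRA in percent, assuming the convention WRA = 100 - WER."""
    return 100.0 - word_error_rate(reference, hypothesis)
```

With a four-word reference and a hypothesis that substitutes one word and drops another, the edit distance is 2, so this sketch gives a WER of 50% and a WRA of 50%. Because insertions count as errors, WER can exceed 100% and WRA under this convention can become negative.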
Highlights
The frequent pauses between the speech sounds of a speech signal are a unique characteristic that distinguishes it from all other signals
The database consists of three sets for the Kannada language: isolated digits (0-9), isolated words, and continuous Kannada speech consisting of spontaneous spoken Kannada sentences
The database consists of three sets for the English language: isolated digits (0-9) (TIMIT), isolated words (TIMIT), and continuous English speech (LibriSpeech)
Summary
The acoustic and language models are created with the Kaldi toolkit. The speech database is collected from native male and female Kannada speakers. 75% of the collected speech data is used for training the acoustic models, and the remaining 25% is used for testing the system. System performance is reported in terms of Word Error Rate (WER). Under uncontrolled conditions, the proposed features perform slightly better than conventional features such as MFCC and PLP in terms of Word Recognition Accuracy (WRA) and WER. On the speech corpus collected in the Kannada language, the proposed features show an improvement in WRA of 1.79% over the baseline features.