Abstract

An automatic speech recognition (ASR) system is built for continuous Kannada speech recognition. The acoustic and language models are created with the Kaldi toolkit. The speech database is collected from native male and female Kannada speakers. Of the collected speech data, 80% is used to train the acoustic models and the remaining 20% is used to test the system. System performance is reported in terms of Word Error Rate (WER). Features are extracted using Wavelet Packet Decomposition combined with a Mel filter bank. Under uncontrolled conditions, the proposed features perform slightly better than conventional features such as MFCC and PLP in terms of Word Recognition Accuracy (WRA) and WER. On the collected Kannada speech corpus, the proposed features improve WRA by 1.79% over the baseline features.
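As a rough illustration of the feature-extraction idea, the sketch below computes log subband energies from a full wavelet packet tree built with the Haar wavelet. It is not the paper's method: the Haar filter, the uniform three-level tree, and the function names `haar_step`, `wavelet_packet`, and `log_subband_energies` are all assumptions chosen for brevity, whereas a Mel-like decomposition would presumably use a non-uniform tree with finer resolution at low frequencies.

```python
import math


def haar_step(x):
    """Split a sequence into Haar approximation and detail halves (orthonormal)."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail


def wavelet_packet(frame, levels):
    """Full wavelet packet tree: every node is split at every level.

    Returns the 2**levels leaf nodes in natural (filter-bank) order,
    which is not the same as increasing-frequency order.
    """
    if len(frame) % (2 ** levels) != 0:
        raise ValueError("frame length must be divisible by 2**levels")
    nodes = [list(frame)]
    for _ in range(levels):
        next_nodes = []
        for node in nodes:
            approx, detail = haar_step(node)
            next_nodes.extend([approx, detail])
        nodes = next_nodes
    return nodes


def log_subband_energies(frame, levels=3, eps=1e-10):
    """Log energy of each leaf subband; eps (an assumed floor) avoids log(0)."""
    leaves = wavelet_packet(frame, levels)
    return [math.log(sum(v * v for v in leaf) + eps) for leaf in leaves]


if __name__ == "__main__":
    # A toy 16-sample frame; real frames would come from windowed speech.
    frame = [math.sin(2 * math.pi * 3 * n / 16) for n in range(16)]
    print(log_subband_energies(frame, levels=3))
```

Because the Haar transform is orthonormal, the subband energies sum to the energy of the input frame, which is a convenient sanity check when swapping in a different wavelet or tree shape.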

Highlights

  • The frequent pauses between the speech sounds of a speech signal are a unique characteristic that distinguishes it from all other signals

  • The Kannada database consists of three sets: isolated digits (0-9), isolated words, and continuous Kannada speech consisting of spontaneous spoken Kannada sentences

  • The English database consists of three sets: isolated digits 0-9 (TIMIT), isolated words (TIMIT), and continuous English speech (LibriSpeech)

Summary

D J Ravi Vidyavardhaka College of Engineering

The acoustic and language models are created with the Kaldi toolkit. The speech database is collected from native male and female Kannada speakers. Of the collected speech data, 75% is used to train the acoustic models and the remaining 25% is used to test the system. System performance is reported in terms of Word Error Rate (WER). Under uncontrolled conditions, the proposed features perform slightly better than conventional features such as MFCC and PLP in terms of Word Recognition Accuracy (WRA) and WER. On the collected Kannada speech corpus, the proposed features improve WRA by 1.79% over the baseline features.
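For readers unfamiliar with the metrics, WER is conventionally the word-level edit distance (substitutions, deletions, insertions) between the reference transcript and the recognizer output, divided by the number of reference words. The minimal sketch below follows that convention; treating WRA as 1 - WER is an assumption here, since the summary does not state the exact definition used, and the function names are illustrative only.

```python
def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    # prev[j] holds the distance between ref[:i-1] and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i in range(1, len(ref) + 1):
        curr = [i] + [0] * len(hyp)
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            curr[j] = min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            )
        prev = curr
    return prev[len(hyp)]


def wer(reference, hypothesis):
    """Word Error Rate: edit distance divided by reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must not be empty")
    return edit_distance(ref, hyp) / len(ref)


def wra(reference, hypothesis):
    """Word Recognition Accuracy, assumed here to be 1 - WER."""
    return 1.0 - wer(reference, hypothesis)


if __name__ == "__main__":
    # Placeholder tokens stand in for transcribed words.
    print(wer("w1 w2 w3 w4", "w1 wX w3"))
    print(wra("w1 w2 w3 w4", "w1 wX w3"))
```

Note that WER can exceed 1 when the hypothesis contains many insertions, so under this assumed definition WRA can be negative.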

INTRODUCTION
RELATED WORKS
PROPOSED FEATURES
Theoretical Background of Wavelet Transforms
Mel Filter like WP Decomposition
PERFORMANCE ANALYSIS
DATABASE
Acoustic Model
RESULTS
CONCLUSION
