Development of Speaker-Independent Automatic Speech Recognition System for Kannada Language

Praveen Kumar, H S Jayanna

doi:10.17485/ijst/v15i8.2322

Open Access

https://doi.org/10.17485/ijst/v15i8.2322

Journal: Indian Journal of Science and Technology	Publication Date: Feb 27, 2022
Citations: 2	License type: cc-by

Abstract

Objectives: The primary goal is to establish a Continuous Speech Recognition (CSR) framework for recognising continuous speech in Kannada. Working with a local language such as Kannada is challenging because it lacks a standard language database. Methods: Monophone, triphone, deep neural network (DNN)-hidden Markov model (HMM) and Gaussian Mixture Model (GMM)-HMM models were implemented in the Kaldi toolkit and used for continuous Kannada speech recognition (CKSR). Mel-frequency cepstral coefficients (MFCC) are used to extract feature vectors from the speech data. The continuous Kannada speech database consists of 2800 speakers (1680 males and 1120 females) aged 8 to 80 years. The training and testing data are in the ratio 80:20. Hybrid modelling techniques are implemented to recognize the spoken words. Findings: Model efficiency is measured by the word error rate (WER), and the results are assessed against well-known datasets such as TIMIT and Aurora-4. Using Kaldi-based feature extraction recipes, the monophone, triphone, DNN-HMM and GMM-HMM acoustic models achieved WERs of 8.23%, 5.23%, 4.05% and 4.64%, respectively. The experimental results suggest that the recognition rate on the Kannada speech data is higher than that reported for state-of-the-art databases. Novelty: We propose a novel automatic speech recognition system for the Kannada language. The main motivation is that only limited sources of standard continuous Kannada speech are available. We created a large-vocabulary Kannada database and implemented monophone, triphone, subspace Gaussian mixture model (SGMM) and hybrid modelling techniques to develop the system.
Keywords: DNN; Continuous speech; HMM; Kannada dialect; Kaldi toolkit; monophone; triphone; WER
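
The abstract evaluates every model by word error rate (WER). As a minimal sketch of how that metric is usually computed (this is not the authors' or Kaldi's scoring code), WER is the word-level edit distance between the reference transcript and the recognised hypothesis, divided by the number of reference words. The function name `word_error_rate` and the placeholder transcripts are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical transcripts: one substitution and one deletion over four reference words.
print(word_error_rate("w1 w2 w3 w4", "w1 wX w3"))
```

Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference.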

Highlights

Effective research into Kannada speech recognition (SR) is much needed.
This work sets up a Continuous Speech Recognition (CSR) network for the Kannada language using phoneme modelling, where each phoneme is represented by a 5-state hidden Markov model (HMM) and each state is represented by a Gaussian Mixture Model (GMM).
The findings reveal that the SR systems produce a phone error rate (PER) of 24.21% and a word error rate (WER) of 4.12%.
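
The phoneme model in the highlights, a 5-state HMM with a GMM per state, can be sketched roughly as follows. This is an illustrative sketch only: the left-to-right topology, the self-loop probability of 0.5, the diagonal covariances and both function names are assumptions, and none of it is taken from the paper or from Kaldi's code.

```python
import math


def left_to_right_transitions(n_states=5, self_loop=0.5):
    """Build a left-to-right transition table (assumed topology).

    Row i holds P(next | state i). The extra last column is the exit from the
    phoneme model, so the table has n_states rows and n_states + 1 columns.
    """
    rows = []
    for i in range(n_states):
        row = [0.0] * (n_states + 1)
        row[i] = self_loop            # stay in the same state
        row[i + 1] = 1.0 - self_loop  # advance to the next state, or exit
        rows.append(row)
    return rows


def gmm_log_likelihood(x, weights, means, variances):
    """Log-likelihood of feature vector x under a diagonal-covariance GMM."""
    component_logs = []
    for w, mu, var in zip(weights, means, variances):
        log_p = math.log(w)
        for xi, m, v in zip(x, mu, var):
            log_p += -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        component_logs.append(log_p)
    # Log-sum-exp over the mixture components for numerical stability.
    top = max(component_logs)
    return top + math.log(sum(math.exp(c - top) for c in component_logs))


transitions = left_to_right_transitions()
# Toy two-component mixture over a 2-dimensional feature vector.
score = gmm_log_likelihood(
    [0.1, -0.2],
    weights=[0.6, 0.4],
    means=[[0.0, 0.0], [1.0, 1.0]],
    variances=[[1.0, 1.0], [0.5, 0.5]],
)
```

In a real recogniser the feature vector would be an MFCC frame and the mixture parameters would be estimated from training data rather than written by hand.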

Summary

Introduction

Effective research into Kannada SR is much needed. This work sets up a CSR network for the Kannada language using phoneme modelling, where each phoneme is represented by a 5-state HMM and each state is represented by a GMM. Such a system can be very useful for digitizing old palm-leaf manuscripts by having someone read them aloud. Such efforts will contribute to research on the development of SR systems for the Kannada language. In [9], the authors presented their work on building an LVCSR system for the Tamil dialect using DNN. They used 8 hours of Tamil speech collected from 30 speakers with a lexicon size of 13,984 words, of which 5 hours were used for training. An extensive literature survey shows that little notable work has been done on CKSR. This motivated us to develop our own database of 2800 speakers, collected throughout the state of Karnataka under real-world conditions, and to examine the behaviour of state-of-the-art techniques on continuous Kannada speech. The phoneme-level lexicon is built from the speech data.
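
The abstract states that training and testing data are in the ratio 80:20 over a 2800-speaker database. One way to realise such a split for a speaker-independent system is to divide by speaker, so that no speaker appears in both sets. Whether the authors split by speaker or by utterance is not stated, so the sketch below is an assumption, as are the function name `split_speakers` and the speaker ID format.

```python
import random


def split_speakers(speaker_ids, train_fraction=0.8, seed=0):
    """Shuffle speaker IDs reproducibly and cut them into train and test lists."""
    ids = sorted(speaker_ids)          # fixed starting order for reproducibility
    random.Random(seed).shuffle(ids)   # seeded shuffle, independent of global state
    cut = int(round(len(ids) * train_fraction))
    return ids[:cut], ids[cut:]


# Hypothetical speaker IDs for a 2800-speaker database.
speakers = [f"spk{n:04d}" for n in range(1, 2801)]
train, test = split_speakers(speakers)
print(len(train), len(test))
```

With 2800 speakers this gives 2240 training and 560 test speakers, and every utterance of a given speaker lands on the same side of the split.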

Kannada phoneme characteristics

Data collection and preparation

Feature Extraction

Language Model (LM)

Acoustic Model (AM)

Monophone model generation

Creating triphone models

GMM-HMM Modelling

DNN-HMM Modelling

Training and Testing

Experiment and Result Analysis

Findings

Conclusions

Similar Papers

Continuous Kannada Speech Recognition System Under Degraded Condition
P S Praveen Kumar ... H S Jayanna
Circuits, Systems, and Signal Processing | VOL. 39 | 15 Jul 2019

Performance Analysis of Hybrid Automatic Continuous Speech Recognition Framework for Kannada Dialect
P S Praveen Kumar ... H S Jayanna
01 Jul 2019

Performance Analysis of various Front-end and Back End Amalgamations for Noise-robust DNN-based ASR
Mohit Dua ... Vinam Agrawal
Recent Advances in Computer Science and Communications | VOL. 14 | 01 Dec 2021

Usage of Combinational Acoustic Models (DNN-HMM and SGMM) and Identifying the Impact of Language Models in Sinhala Speech Recognition
Buddhi Gamage ... Thilini Nadungodage
04 Nov 2020
