Experimental Analysis on Performance of Speech Utterance recognition using AI Models

Srikanth G N, M K Venkatesha

doi:10.52783/tjjpt.v44.i5.2511

Open Access

Abstract

In automatic speech recognition, system performance is crucial for meeting the requirements of human-machine interfaces (HMI) and, more recently, of IoT-related applications. At the same time, demand has grown for detecting strong, critical features derived from speech utterances. This paper presents the performance of machine learning algorithms developed for spoken-digit recognition and classification. The prepared dataset contains utterances of the number words from 1 to 10, recorded from speakers of different age groups. Audacity software was used to preprocess the audio files, which included removing noise from the signal and trimming the silence on either side of each word utterance. The audio signals are sampled at fs = 48 kHz. We developed four AI models to recognise the word utterances. The audio signals are processed to derive two distinct feature sets for each word utterance: a set of statistical features and a set of singular values obtained by singular value decomposition (SVD). The cepstral values for each utterance are obtained as mel-frequency cepstral coefficients (MFCC). The variance-covariance matrix is calculated from the resulting MFCC matrix. Its diagonal values, which are the variances, are recorded as feature set-1 for the word utterance and used as input to the machine learning algorithms. The performance metrics of the developed models are recorded. To keep the computational cost associated with the feature sets to a minimum, dimensionality reduction is carried out by applying SVD to the extracted MFCC matrix. The derived singular values form feature set-2, which is used to train and test the developed AI models with a 70:30 split. We present and discuss the performance and results produced by the MLP, KNN, SVM, and Random Forest algorithms. In comparison, MLP and Random Forest were found to show excellent performance on both feature sets, with 100% training accuracy and 99% test accuracy.
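
The two feature sets described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the 13 x 120 matrix size, and the layout (one row per cepstral coefficient, one column per frame) are assumptions made for illustration, and a random matrix stands in for a real MFCC matrix because the abstract does not specify the MFCC settings.

```python
import numpy as np


def variance_features(mfcc):
    """Feature set-1 (sketch): diagonal of the variance-covariance matrix.

    Assumes mfcc has shape (n_coeffs, n_frames), so each row is one
    cepstral coefficient observed across frames.
    """
    cov = np.cov(mfcc)      # rows are treated as variables: (n_coeffs, n_coeffs)
    return np.diag(cov)     # the variances, one per coefficient


def singular_value_features(mfcc):
    """Feature set-2 (sketch): singular values of the MFCC matrix, largest first."""
    return np.linalg.svd(mfcc, compute_uv=False)


# Synthetic stand-in for one utterance's MFCC matrix (assumed size 13 x 120).
rng = np.random.default_rng(0)
mfcc = rng.normal(size=(13, 120))

fs1 = variance_features(mfcc)        # one variance per coefficient
fs2 = singular_value_features(mfcc)  # min(13, 120) = 13 singular values
```

Both functions reduce a matrix whose width depends on utterance length to a fixed-length vector, which is what allows utterances of different durations to be fed to the same classifier.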

Full Text