Improvement of phone recognition accuracy using speech mode classification

Kumud Tripathi, K Sreenivasa Rao

doi:10.1007/s10772-017-9483-4

Abstract

In this work, we develop a speech mode classification (SMC) model to improve the performance of a phone recognition system (PRS). We explore vocal tract system, excitation source, and prosodic features for developing the SMC model. These features are extracted from the voiced regions of the speech signal. Conversation, extempore, and read speech are considered as three different modes of speech. The vocal tract component of speech is captured using Mel-frequency cepstral coefficients (MFCCs). The excitation source features are captured through Mel power differences of spectrum in sub-bands (MPDSS) and residual Mel-frequency cepstral coefficients (RMFCCs) of the speech signal. The prosodic information is extracted from pitch and intensity. Speech mode classification models are developed using these features both independently and in fusion. Experiments are carried out on a Bengali speech corpus to analyze the accuracy of the speech mode classification model using artificial neural network (ANN), naive Bayes, support vector machine (SVM), and k-nearest neighbor (KNN) classifiers. The four classification models are combined using a maximum voting approach for optimal performance. The results show that the speech mode classification model developed using the fusion of vocal tract system, excitation source, and prosodic features yields the best performance of 98%. Finally, the proposed speech mode classifier is integrated into the PRS, and the accuracy of the phone recognition system is observed to improve by 11.08%.
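The maximum voting step mentioned in the abstract can be illustrated with a minimal sketch. It assumes each of the four classifiers has already produced one mode label per utterance; the function names `max_vote` and `fuse`, the example labels, and the tie-breaking rule are hypothetical and are not taken from the paper.

```python
from collections import Counter


def max_vote(predictions):
    """Return the label chosen by the most classifiers for one utterance.

    Assumption: ties go to the tied label that appears first in
    `predictions`; the abstract does not say how ties are resolved.
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label


def fuse(per_classifier_labels):
    """Combine equal-length label lists (one list per classifier)."""
    return [max_vote(list(votes)) for votes in zip(*per_classifier_labels)]


# Hypothetical outputs of four classifiers on three utterances.
ann = ["read", "read", "conversation"]
naive_bayes = ["read", "extempore", "conversation"]
svm = ["extempore", "extempore", "read"]
knn = ["read", "conversation", "conversation"]

fused = fuse([ann, naive_bayes, svm, knn])
print(fused)
```

With an even number of voters, ties are possible, so any real system would need an explicit rule such as the one assumed here or a confidence-weighted vote.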


Journal: International Journal of Speech Technology
Publication Date: Dec 7, 2017
Citations: 10

Similar Papers

Source and system features for phone recognition
K E Manjunath ... K Sreenivasa Rao
International Journal of Speech Technology | VOL. 18
09 Dec 2014

Multilingual and multimode phone recognition system for Indian languages
Kumud Tripathi ... K Sreenivasa Rao
Speech Communication | VOL. 119
26 Feb 2020

Articulatory and excitation source features for speech recognition in read, extempore and conversation modes
K E Manjunath ... K Sreenivasa Rao
International Journal of Speech Technology | VOL. 19
11 Dec 2015

Analyzing RMFCC Feature for Dialect Identification in Ao, an Under-Resourced Language
Moakala Tzudir ... Shikha Baghel
24 May 2022
