Abstract
Background/Objectives: Automatic speech recognition (ASR) benefits human beings in many useful applications. Various ASR systems exhibiting good performance have been developed for normal speakers. The speech produced by a voice disordered patient is not like a normal speaker due to irregular vibration and incomplete closure of vocal fold. Therefore, an investigation is required by exploring the different speech features to develop an ASR system which can perform well for both pathological and normal speakers. Methods: In this paper, we proposed an automatic speech recognition system using Hidden Markov Model Toolkit (HTK) for normal and pathology voice. Four techniques are applied for feature extraction; Mel Frequency Cepstral Coefficient (MFCC), Perceptual Linear Prediction (PLP), RelAtiveSpecTrA - Perceptual Linear Predictive (RASTA-PLP), and linear prediction coefficients (LPC). The database that used to evaluate the performance of the developed system; includes a total of 297 speakers 121 of them were normal speakers and the remaining containing five types of vocal fold disorders. Findings: Experimental results show that the developed system gives good accuracies for normal and pathology voice. The highest accuracy of 94.44 % with a word error rate 5.55% is achieved in case of normal voice, and 88.63 % with a word error rate 11.63 % in case of pathology voice. Fuzzy logic controller is proposed to automatically segmentation the normal and disorders voice.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.