Automatic Recognition of Pathological Phoneme Production

Robert Wielgat, Tomasz Woźniak, Stanisław Grabias, Tomasz P. Zieliński, Daniel Król

doi:10.1159/000170083

Abstract

Objective: Proper diagnosis and therapy of pathological phoneme pronunciation play an important role in modern logopedics. To make diagnosis and therapy more efficient, this paper addresses the automatic recognition of pathological phoneme pronunciation, focusing on the therapy of phoneme substitution disorders. Patients and Methods: The speech samples came from speech-impaired Polish children and, in part, from persons imitating speech disorders. The disorders to be recognized were substitutions in the pairs {s, ʂ}, {ɕ, ʂ}, {ʦ, tʂ+}, {ʨ, tʂ+}, {ʣ, dʐ+}, and {ʥ, dʐ+}, embedded in Polish carrier words. To detect substitutions in the recognized words, the recently proposed human factor cepstral coefficients (HFCC) were implemented. The efficiency of the HFCC approach was compared with that of standard mel-frequency cepstral coefficients (MFCC) as a feature vector. Both dynamic time warping (DTW), working on whole words or on embedded phoneme patterns, and hidden Markov models (HMM) were used as classifiers. The HMM classifier was based on whole-word models as well as phoneme models. Results: A comparative analysis of the DTW and HMM methods is presented. Conclusions: The superiority of HFCC features over MFCC features was demonstrated. Results obtained by the DTW methods, mainly by the modified phoneme-based DTW classifier, were slightly better than those of the HMM classifier. Results for the detection of substitutions in the pairs {s, ʂ}, {ʦ, tʂ+}, and {ʣ, dʐ+} are very promising, and the methods developed for these cases can be integrated into computer systems for speech therapy. For substitutions in the pairs {ɕ, ʂ}, {ʨ, tʂ+}, and {ʥ, dʐ+}, further research is necessary.
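
To illustrate the whole-word DTW classification mentioned in the abstract, the following is a minimal sketch of textbook dynamic time warping with nearest-template labeling. It is not the authors' modified phoneme-based classifier. The function names, the labels "correct" and "substituted", and the one-dimensional toy feature vectors are hypothetical stand-ins for real HFCC or MFCC frames.

```python
import math


def dtw_distance(a, b):
    """Accumulated cost of the best DTW alignment between two sequences
    of feature vectors, using Euclidean distance between frames."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best accumulated cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            # Allowed steps: advance in a, advance in b, or advance in both.
            cost[i][j] = d + min(cost[i - 1][j],
                                 cost[i][j - 1],
                                 cost[i - 1][j - 1])
    return cost[n][m]


def classify(sample, templates):
    """Return the label whose closest template has the smallest DTW cost.
    `templates` maps a label to a list of reference sequences."""
    return min(
        templates,
        key=lambda label: min(dtw_distance(sample, t) for t in templates[label]),
    )


# Toy data (assumed values, not taken from the paper): each frame is a
# one-dimensional "feature vector" instead of a real cepstral vector.
templates = {
    "correct": [[(0.0,), (1.0,), (2.0,)]],
    "substituted": [[(0.0,), (3.0,), (6.0,)]],
}
# Same contour as the "correct" template, spoken more slowly.
sample = [(0.0,), (1.0,), (1.0,), (2.0,)]
print(classify(sample, templates))
```

Because DTW stretches the time axis, the slower sample aligns with the "correct" template at zero cost despite its extra frame. This tolerance to speaking rate is the usual reason for choosing DTW for template matching on short utterances.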
