Abstract
In perceptual and acoustic assessment of voice disorders, a commonly asked question is what type of speech sample should be examined and analyzed. While sustained vowels are simple, easy to produce, and suitable for direct comparison among patients, continuous speech is ecologically more valid and better represents the daily use of voice. This study compares the effectiveness and contribution of sustained vowels and continuous speech in predicting voice disorder severity. To measure voice disorder severity for sustained vowels and continuous speech utterances, two three-class (mild, moderate and severe) classifiers are trained separately with different acoustic feature sets. The acoustic features include measures of perturbation, prosody, noise, spectrum and cepstrum. Experimental results on a Cantonese speech database of disordered voice show that sustained vowels and continuous speech are complementary to each other. Owing to their different production complexities, these two types of utterances may exhibit significantly different degrees of voice abnormality, and hence be assigned different severity levels. This suggests that when predicting subject-level severity from utterance-level measures, the speech type of each individual utterance must be taken into account. Specifically, a posterior fusion approach is proposed for generating the subject-level prediction. In the three-class (mild, moderate and severe) subject-level disorder severity prediction task, the proposed method achieves an accuracy of 75% and an AUC (area under the ROC curve) of 0.862.
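As an illustration of what posterior fusion across speech types could look like, the following is a minimal sketch and not the method reported in the paper. The abstract does not specify the fusion rule, so the rule here is an assumption: utterance-level class posteriors are averaged within each speech type, and the two averages are then combined with a weight. The names `mean_posterior`, `fuse_posteriors`, `predict_severity` and the parameter `vowel_weight` are hypothetical, and the numbers in the usage example are made up for demonstration only.

```python
from statistics import fmean

# Severity classes, in the order used by the posterior vectors.
CLASSES = ("mild", "moderate", "severe")


def mean_posterior(posteriors):
    """Average a list of per-utterance posterior vectors, class by class."""
    return [fmean(column) for column in zip(*posteriors)]


def fuse_posteriors(vowel_posteriors, speech_posteriors, vowel_weight=0.5):
    """Fuse utterance-level posteriors into one subject-level posterior.

    Assumed rule (not taken from the paper): average within each speech
    type first, so that the type with more utterances does not dominate,
    then take a weighted sum of the two averages and renormalize.
    """
    vowel_mean = mean_posterior(vowel_posteriors)
    speech_mean = mean_posterior(speech_posteriors)
    fused = [
        vowel_weight * v + (1.0 - vowel_weight) * s
        for v, s in zip(vowel_mean, speech_mean)
    ]
    total = sum(fused)
    return [p / total for p in fused]


def predict_severity(vowel_posteriors, speech_posteriors, vowel_weight=0.5):
    """Return the most probable severity label and the fused posterior."""
    fused = fuse_posteriors(vowel_posteriors, speech_posteriors, vowel_weight)
    best_index = max(range(len(CLASSES)), key=lambda i: fused[i])
    return CLASSES[best_index], fused


if __name__ == "__main__":
    # Made-up posteriors for one subject: the vowel classifier leans "mild",
    # the continuous-speech classifier leans "severe".
    vowels = [[0.6, 0.3, 0.1], [0.5, 0.4, 0.1]]
    speech = [[0.1, 0.3, 0.6], [0.2, 0.2, 0.6], [0.1, 0.2, 0.7]]
    label, fused = predict_severity(vowels, speech)
    print(label, [round(p, 3) for p in fused])
```

Averaging within each speech type before combining reflects the observation that the two types can be assigned different severity levels for the same subject; a single pooled average would implicitly weight the types by how many utterances of each were recorded.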