Abstract

Perceptual rating scales are widely used for the assessment of voice quality, but such ratings may be influenced by the individual experience of the listener. Researchers have therefore turned to acoustic measures that may correlate with perceived voice quality. In this study we tested whether multivariate statistics, combined with artificial neural networks, could identify patterns of acoustic voice parameters corresponding to a widely used perceptual rating scale. In a multicenter study with 31 raters, voice samples from 117 individuals with and without voice disorders were rated perceptually using the RBH index, which grades roughness (R), breathiness (B), and hoarseness (H) each on a 4-point scale. The voice samples were then subjected to acoustic feature extraction and classified with a multivariate regression tree analysis that used the perceptual ratings as a priori information. Artificial neural networks were trained on the acoustic parameters showing high "relative importance" in the regression trees. Mean classification accuracies were around 30% for topographic feature maps (trained with the Learning Vector Quantization algorithm) and 65-85% for feedforward networks (trained with the RProp algorithm). Based on the best results with feedforward networks, a classification system (computer program) consisting of 50 simultaneously working networks was developed. Using this program, the classifications matched the a priori ratings in both the R and B domains for 40% of the samples, and in at least one domain for 65%. These accuracies are within the range reported by other authors using artificial neural networks in biology and clinical medicine. The results thus encourage further research into feedforward networks for acoustic voice analysis.
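To make the ensemble step concrete, the Python sketch below shows one way such a voting system could be assembled. It is not the authors' program: scikit-learn's MLPClassifier does not offer the RProp optimizer used in the study, so the Adam optimizer is substituted, and the feature matrix, label coding, and network size are illustrative assumptions only.

import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

# Hypothetical stand-in data: 117 samples x 10 selected acoustic parameters,
# each labeled with a 4-point grade (0-3) as on the RBH scale.
X = rng.normal(size=(117, 10))
y = rng.integers(0, 4, size=117)

# Train 50 feedforward networks that differ only in their random initialization.
ensemble = []
for seed in range(50):
    net = MLPClassifier(hidden_layer_sizes=(16,), solver="adam",
                        max_iter=2000, random_state=seed)
    net.fit(X, y)
    ensemble.append(net)

def classify(sample):
    """Return the majority-vote grade of all 50 networks for one feature vector."""
    votes = [int(net.predict(sample.reshape(1, -1))[0]) for net in ensemble]
    return int(np.bincount(votes, minlength=4).argmax())

print(classify(X[0]))

Varying only the random seed across the 50 networks is one plausible reading of "simultaneously working networks": the individual classifiers disagree, and their disagreements are resolved by majority vote.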
