Abstract

To improve the performance of pathological voice detection and classification, gammatone spectral latitude (GTSL) features are proposed. GTSL features are inspired by the nonlinear phenomena that arise in human phonation and therefore carry explicit physiological meaning. They also incorporate characteristics of human auditory perception. GTSL features quantify turbulent noise through nonlinear compression of the peak value and dynamic range of the spectrum in each frequency channel. For pathological voice detection, GTSL features worked better with traditional machine learning algorithms than did traditional nonlinear features and gammatone cepstral coefficients (GTCCs). In the classification of healthy, neuromuscular, and structural voices, the proposed features achieved an average accuracy of 99.6% on the Massachusetts Eye and Ear Infirmary (MEEI) database, which is 35.6% higher than other gammatone features. The accuracies on the other databases, the Saarbruecken Voice Database (SVD) and Hospital Universitario Príncipe de Asturias (HUPA), were 89.9% and 97.4%, respectively. The experimental results indicate that GTSL features can provide objective evaluation of voice diseases with low computational complexity and low database dependency.
