Abstract

Although many studies have analyzed pathological voices using conventional parameters such as jitter, shimmer, and signal-to-noise ratio (SNR), these parameters are sensitive to the choice of pitch extraction algorithm and cannot be applied to severely disordered voice signals that exhibit irregular or aperiodic waveforms. In this paper, higher-order statistics (HOS) analysis, which is independent of the pitch period, is applied to linear predictive coding (LPC) residuals to describe breathy and rough voices. Recordings of a sustained /a/ from 23 individuals with breathy voices and 30 individuals with rough voices were taken from the disordered voice database distributed by the Japanese Society of Logopedics and Phoniatrics. We extracted the conventional parameters as well as HOS-based parameters such as the normalized skewness and the normalized kurtosis. In addition, we calculated the HOS-based parameters in the LPC residual domain. The results showed that both the HOS-based parameters and the HOS-based parameters estimated from the LPC residual differ between rough and breathy voices, whereas the conventional parameters were not distinctive for these voices. A classification and regression tree (CART) was used to combine multiple parameters and to classify breathy and rough voices. Using the HOS-based parameters, the CART achieved an accuracy of 85.0%, with the optimal decision tree generated by means of the normalized skewness and kurtosis. When the HOS-based parameters from the LPC residual were used, the optimal decision tree was 88.7% accurate and included the variances of the normalized skewness and kurtosis.
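As an illustration of the kind of pitch-independent features described above, the following is a minimal sketch rather than the authors' implementation. It assumes an autocorrelation-method LPC, an LPC order of 12, a single LPC fit over the whole recording, 512-sample frames with a 256-sample hop, and a normalized kurtosis defined as the excess kurtosis (fourth standardized moment minus 3). None of these choices is stated in the abstract, and all function names are hypothetical.

```python
import numpy as np

def lpc_coefficients(x, order):
    """Autocorrelation-method LPC (assumed variant): solve R a = r."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    # Autocorrelation at lags 0..order
    r = np.array([np.dot(x[: n - k], x[k:]) for k in range(order + 1)])
    # Toeplitz autocorrelation matrix built from lags 0..order-1
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    # Small ridge term for numerical stability (an assumption, not from the paper)
    R = R + 1e-9 * r[0] * np.eye(order)
    # Predictor: x_hat[n] = sum_k a[k] * x[n - 1 - k]
    return np.linalg.solve(R, r[1:])

def lpc_residual(x, order=12):
    """Inverse-filter the signal with A(z) = 1 - sum_k a_k z^-k."""
    x = np.asarray(x, dtype=float)
    a = lpc_coefficients(x, order)
    inverse_filter = np.concatenate(([1.0], -a))
    return np.convolve(x, inverse_filter)[: len(x)]

def normalized_skewness(x):
    """Third central moment divided by the variance to the power 3/2."""
    d = np.asarray(x, dtype=float)
    d = d - d.mean()
    return np.mean(d ** 3) / np.mean(d ** 2) ** 1.5

def normalized_kurtosis(x):
    """Fourth central moment over squared variance, minus 3 (assumed convention)."""
    d = np.asarray(x, dtype=float)
    d = d - d.mean()
    return np.mean(d ** 4) / np.mean(d ** 2) ** 2 - 3.0

def hos_features(x, frame_len=512, hop=256, order=12):
    """Mean and variance of frame-wise skewness and kurtosis of the LPC residual."""
    res = lpc_residual(x, order)
    starts = range(0, len(res) - frame_len + 1, hop)
    sk = np.array([normalized_skewness(res[s : s + frame_len]) for s in starts])
    ku = np.array([normalized_kurtosis(res[s : s + frame_len]) for s in starts])
    return {
        "skew_mean": sk.mean(), "skew_var": sk.var(),
        "kurt_mean": ku.mean(), "kurt_var": ku.var(),
    }
```

Because none of these quantities requires a pitch estimate, they can be computed even for aperiodic signals. A feature dictionary like the one returned by `hos_features` could then be passed to any decision-tree learner; the abstract does not specify which CART implementation or settings were used.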
