Abstract

Assessing the quality of speech disordered by dysarthria, laryngeal disease, stroke or other impairments provides the feedback needed to track the progression of a patient's pathology. Methods based on perceptual evaluation of speech quality are accurate and robust, but they are expensive and time-consuming. This study develops a distance metric for disordered speech quality assessment that is based on signal processing techniques and is inexpensive, fast and easy to use. Previous work from our group has shown that this can be achieved with short-term Mel-frequency cepstral coefficients (MFCC) combined with dynamic time warping (DTW) and Itakura-Saito distortion measures. Disordered speech, however, is also characterized by several long-term signal variations: increased hoarseness, speaking-rate variability, word-closure variability, high-frequency noise components, and large variations in pitch period and peak amplitude. We develop an improved distortion metric that accounts for these characteristics by augmenting the feature set with jitter, shimmer, amplitude perturbation quotient (APQ), pitch perturbation quotient (PPQ) and harmonics-to-noise ratio (HNR), and by using DTW for alignment. Initial results on a speech verification database are promising; development of the full system is underway.
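
As a rough illustration of two ingredients named above, the following sketch computes a DTW alignment cost between two sequences of feature frames and a relative perturbation measure of the kind commonly used for local jitter (on pitch periods) and local shimmer (on peak amplitudes). It is a minimal sketch under stated assumptions, not the metric developed in this study: the function names `dtw_distance` and `relative_perturbation` are hypothetical, the local frame cost is assumed to be Euclidean, and the abstract does not specify how the perturbation features are combined with the DTW cost, so they are kept separate here.

```python
import math


def dtw_distance(a, b):
    """Accumulated DTW cost between two sequences of feature frames.

    a, b: non-empty lists of equal-dimension frames (each a list of floats),
    e.g. per-frame MFCC vectors. Euclidean local cost is an assumption.
    """
    n, m = len(a), len(b)
    # acc[i][j] = minimal cost of aligning a[:i] with b[:j]
    acc = [[math.inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = math.dist(a[i - 1], b[j - 1])
            # Allowed steps: insertion, deletion, match
            acc[i][j] = cost + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
    return acc[n][m]


def relative_perturbation(values):
    """Mean absolute difference between consecutive values, divided by the mean.

    Applied to pitch periods this is a local jitter estimate; applied to
    peak amplitudes it is a local shimmer estimate.
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(values[i + 1] - values[i]) for i in range(len(values) - 1)]
    return (sum(diffs) / len(diffs)) / (sum(values) / len(values))


# Toy data (made up for illustration): the second sequence repeats one frame,
# which DTW absorbs by warping the time axis.
reference = [[0.0], [1.0], [2.0]]
slowed = [[0.0], [1.0], [1.0], [2.0]]
alignment_cost = dtw_distance(reference, slowed)
jitter = relative_perturbation([1.0, 2.0, 1.0, 2.0])
```

The quadratic-time DTW recursion is adequate for short utterances; APQ and PPQ, which smooth the perturbation over several neighboring cycles, and HNR are omitted from this sketch.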

