Improving the intelligibility of dysarthric speech using a time domain pitch synchronous-based approach

Mahesh Giri, Neela Rayavarapu

doi:10.11591/ijece.v13i4.pp4041-4051

Abstract

Dysarthria is a motor speech impairment that reduces the intelligibility of speech. Observations indicate that, for different types of dysarthria, the fundamental frequency, intensity, and speech rate differ from those of unimpaired speakers. The proposed enhancement technique therefore modifies these parameters so that they fall within the range for unimpaired speakers. The fundamental frequency and speech rate of dysarthric speech are modified using the time domain pitch synchronous overlap and add (TD-PSOLA) algorithm, and its intensity is then modified using an approach based on the fast Fourier transform (FFT) and inverse fast Fourier transform (IFFT). The technique is applied to impaired speech samples from ten dysarthric speakers. The change in intelligibility between the impaired and enhanced speech is evaluated using the rating scale and word count methods. The improvement is significant for speakers whose original intelligibility was poor and minimal for speakers whose intelligibility was already high. According to the rating scale method, the change in intelligibility across speakers ranges from 9% to 53%, whereas according to the word count method it ranges from 0% to 53%.
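
The FFT/IFFT-based intensity step can be illustrated with a minimal sketch. The abstract does not say how the gain is chosen or applied, so the uniform gain in decibels used here is an assumption, and the helper names (`dft`, `idft`, `rms`, `scale_intensity`) are hypothetical rather than taken from the paper. A naive transform is used so that the example needs only the Python standard library.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform, O(n^2); adequate for a short frame."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Naive inverse discrete Fourier transform, O(n^2)."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def rms(x):
    """Root-mean-square amplitude, used here as a simple intensity measure."""
    return math.sqrt(sum(v * v for v in x) / len(x))


def scale_intensity(frame, gain_db):
    """Scale a frame's intensity by gain_db decibels in the frequency domain.

    Assumption: the same gain is applied to every frequency bin.
    """
    gain = 10.0 ** (gain_db / 20.0)           # decibels to linear amplitude
    spectrum = dft(frame)                     # time domain to frequency domain
    scaled = [gain * c for c in spectrum]     # apply the gain to each bin
    return [c.real for c in idft(scaled)]     # back to a real-valued frame


# Hypothetical test frame: 64 samples of a sinusoid, raised by 6 dB.
frame = [math.sin(2 * math.pi * 5 * t / 64) for t in range(64)]
louder = scale_intensity(frame, 6.0)
```

With a uniform gain the result equals scaling the samples directly in the time domain; the frequency-domain form becomes useful if the gain is allowed to vary per bin, which this sketch does not attempt.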
