Improving the intelligibility of dysarthric speech by modifying system parameters, retaining speaker's identity

M Saranya, Nagarajan Thangavelu, P Vijayalakshmi

doi:10.1109/icrtit.2012.6206799

Abstract

Dysarthria is a neuromotor impairment of speech that affects one or more of the subsystems involved in speech production. The impairment is reflected in the acoustic characteristics of the phonemes uttered by a dysarthric speaker. A speaker who suffers from laryngeal dysfunction and improper articulation may be unable to utter some or most phonemes properly. In our work, the poorly uttered phonemes in a dysarthric speaker's utterance are located and replaced with the corresponding phonemes from a normal speaker's speech signal. However, the speech signal that results from this concatenation does not sound natural, because of discontinuities at the concatenation points in short-term energy, pitch period, and formant contour. The discontinuity in the short-term energy function is handled by smoothing the short-term energy of a few frames before and after the concatenation point. Because the pitch period in the replaced segment (phoneme) differs considerably from the dysarthric speaker's pitch period, it is adjusted to resemble that of the dysarthric speaker. The quality and naturalness of the utterance increase considerably after pitch modification. The discontinuity in the formant contour arises because the replaced unit is taken from a different context, so the co-articulation effect is absent. Using linear prediction analysis, the pole locations and their corresponding radii are adjusted based on the pole locations of the adjacent phonemes. After all three modifications, the quality and naturalness of the speech signal are found to be very close to those of natural speech.
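
The energy-smoothing step described in the abstract can be illustrated with a minimal sketch in Python with NumPy. This is not the authors' implementation: the abstract does not specify the smoothing rule, so the function name `smooth_energy_at_join`, the use of non-overlapping frames, the default frame length and frame count, and the choice of linear interpolation between the outermost frame energies are all assumptions made for illustration.

```python
import numpy as np

def smooth_energy_at_join(signal, join, frame_len=160, n_frames=3):
    """Rescale frames around `join` so frame energy changes linearly across it.

    Hypothetical helper: frame_len, n_frames and the linear interpolation
    rule are illustrative assumptions, not values taken from the paper.
    """
    out = np.asarray(signal, dtype=float).copy()
    start = join - n_frames * frame_len
    stop = join + n_frames * frame_len
    if start < 0 or stop > len(out):
        raise ValueError("join is too close to the signal boundary")

    # Non-overlapping frames: n_frames before and n_frames after the join.
    frames = out[start:stop].reshape(2 * n_frames, frame_len)
    energy = np.mean(frames ** 2, axis=1)  # short-term energy per frame

    # Target contour: straight line from the first to the last frame energy.
    target = np.linspace(energy[0], energy[-1], 2 * n_frames)

    # Amplitude gain is the square root of the energy ratio.
    gain = np.sqrt(target / np.maximum(energy, 1e-12))
    out[start:stop] = (frames * gain[:, None]).ravel()
    return out

# Toy example: a quiet segment followed by a loud one, joined at sample 800.
t = np.arange(1600)
toy = np.sin(2 * np.pi * t / 160.0)
toy[:800] *= 0.1
smoothed = smooth_energy_at_join(toy, join=800)
```

Scaling whole frames by a constant gain is the simplest choice and can leave small amplitude steps at frame boundaries; a practical system would more likely use overlapping windows or interpolate the gain sample by sample.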

Full Text