Abstract

Articulatory compensations in response to real-time formant perturbation have revealed that auditory feedback plays an important role in speech production. However, these compensatory responses have reached at most 40% of the formant shift and have varied with vowel type and across subjects. Previous formant perturbation studies have relied on linear predictive coding (LPC), whose estimation accuracy is known to degrade for low vowels and female speech because of glottal source-vocal tract interaction. To improve accuracy, we developed a robust real-time formant tracking system based on the phase equalization-based autoregressive exogenous (PEAR) model, which uses glottal source signals measured by electroglottography. In this study, we compared compensatory responses to real-time formant perturbation implemented with PEAR and with LPC. Eleven Japanese subjects (seven female) read aloud a Japanese mora (/hi/ or /he/) while wearing headphones, and the first two formant frequencies were altered. Compensatory responses were significantly larger with PEAR than with LPC. Moreover, PEAR improved the naturalness of the altered speech sounds. These results indicate that the improved naturalness of the speech sounds under PEAR led to larger compensatory responses. Our system should therefore be useful for understanding auditory feedback mechanisms in more detail.
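
For readers unfamiliar with the LPC baseline mentioned above, the following is a minimal sketch of textbook LPC formant estimation (autocorrelation method with Levinson-Durbin recursion, followed by peak picking on the all-pole envelope). It is not the implementation used in this study and it does not cover PEAR; the function names, the Hamming window, the model order, and the 5 Hz frequency grid are all illustrative assumptions.

```python
import cmath
import math


def lpc_coefficients(frame, order):
    """Return [1, a1, ..., ap] of A(z) using the autocorrelation method."""
    n = len(frame)
    # Hamming window to reduce frame-edge effects (an assumed, common choice).
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    # Autocorrelation at lags 0..order.
    r = [sum(windowed[i] * windowed[i + k] for i in range(n - k))
         for k in range(order + 1)]
    # Levinson-Durbin recursion for the prediction-error filter A(z).
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a


def formants_from_lpc(a, fs, step_hz=5.0):
    """Return frequencies (Hz) of local maxima of the envelope 1/|A(e^{jw})|."""
    n_steps = int((fs / 2) / step_hz)
    freqs = [i * step_hz for i in range(n_steps + 1)]
    mags = []
    for f in freqs:
        w = 2 * math.pi * f / fs
        A = sum(c * cmath.exp(-1j * w * k) for k, c in enumerate(a))
        mags.append(1.0 / abs(A))
    # Interior grid points that are higher than both neighbours.
    return [freqs[i] for i in range(1, len(mags) - 1)
            if mags[i - 1] < mags[i] >= mags[i + 1]]
```

Calling `formants_from_lpc(lpc_coefficients(frame, order), fs)` on a short voiced frame returns candidate formant frequencies in ascending order. Because this model treats the excitation as spectrally flat and independent of the vocal tract, it illustrates the kind of estimator whose accuracy the abstract describes as degrading for low vowels and female speech.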
