Abstract

Simulation (or synthesis) of natural‐sounding childlike speech has long been a challenge. This is likely due, at least in part, to an incomplete understanding of the nonlinear interaction of the voice source and the vocal tract filter. Speech production by children is typically characterized by a fairly high fundamental frequency of phonation and a short vocal tract length that produces high formant frequencies. Together, these two characteristics suggest that low‐numbered harmonics (including the fundamental frequency) may often, or even necessarily, be in close proximity to one or more of the formant frequencies. Such conditions may lead to a strong interaction of the acoustic pressures in the vocal tract and the glottal airflow, and possibly the vibration of the vocal folds. The purpose of this study was to use kinematic models of the vocal folds and vocal tract shape, scaled to approximately represent a 5‐year‐old child, to generate individual vowels and sentences. Waveshape and harmonic content of gl...

