A possible role of nonlinear source‐filter interaction in simulation of childlike speech.

Brad H Story

doi:10.1121/1.4784081

Abstract

Simulation (or synthesis) of natural‐sounding childlike speech has long been a challenge. This is likely due, at least in part, to an incomplete understanding of the nonlinear interaction of the voice source and the vocal tract filter. Speech production by children is typically characterized by a fairly high fundamental frequency of phonation and a short vocal tract length that produces high formant frequencies. Together, these two characteristics suggest that low‐numbered harmonics (including the fundamental frequency) may often, or even necessarily, be in close proximity to one or more of the formant frequencies. Such conditions may lead to a strong interaction of the acoustic pressures in the vocal tract and the glottal airflow, and possibly the vibration of the vocal folds. The purpose of this study was to use kinematic models of the vocal folds and vocal tract shape, scaled to approximately represent a 5‐year‐old child, to generate individual vowels and sentences. Waveshape and harmonic content of gl...
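
The proximity argument above can be illustrated with a rough calculation. The sketch below is not the kinematic model used in the study; it assumes a uniform tube closed at the glottis and open at the lips, whose resonances fall at odd multiples of c/(4L). The function names and the numerical values (speed of sound of 350 m/s, a child vocal tract of about 10.5 cm with a fundamental of 300 Hz, an adult tract of about 17.5 cm with a fundamental of 100 Hz) are illustrative assumptions, not figures taken from the paper.

```python
# Illustrative sketch only: a uniform-tube (quarter-wavelength) approximation,
# NOT the kinematic vocal fold / vocal tract models described in the abstract.
# All numerical values below are assumed for illustration.

SPEED_OF_SOUND = 350.0  # m/s, assumed value for warm, humid air


def tube_formants(length_m, n_formants=3, c=SPEED_OF_SOUND):
    """Resonances of a uniform tube closed at one end: F_k = (2k - 1) * c / (4 * L)."""
    return [(2 * k - 1) * c / (4.0 * length_m) for k in range(1, n_formants + 1)]


def nearest_harmonic(f0_hz, freq_hz):
    """Return (harmonic number, harmonic frequency) of the harmonic of f0 closest to freq."""
    n = max(1, round(freq_hz / f0_hz))
    return n, n * f0_hz


def report(label, f0_hz, length_m):
    """Print, for each formant, the closest harmonic and the distance between them."""
    print(f"{label}: f0 = {f0_hz:.0f} Hz, tract length = {length_m * 100:.1f} cm")
    for k, formant in enumerate(tube_formants(length_m), start=1):
        n, harmonic = nearest_harmonic(f0_hz, formant)
        print(f"  F{k} = {formant:7.1f} Hz  nearest harmonic: "
              f"H{n} = {harmonic:7.1f} Hz  (distance {abs(formant - harmonic):.1f} Hz)")


if __name__ == "__main__":
    report("Assumed child", f0_hz=300.0, length_m=0.105)
    report("Assumed adult", f0_hz=100.0, length_m=0.175)
```

Under these assumed values, the harmonic closest to the first formant is the third for the child and the fifth for the adult, and the one closest to the second formant is the eighth rather than the fifteenth. This is consistent with the suggestion that, in childlike speech, the low-numbered harmonics are the ones that fall near the formants.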
