Abstract

A speech synthesizer uses a digital waveguide network to simulate operation of the human pharynx on acoustic signals. One end of the digital waveguide network is connected to a glottal signal source, and another end has a signal filter simulating operation of the acoustic interface at a person's lips. The digital waveguide network has sets of waveguide sections connected in series by junctions, each waveguide section including two digital delay lines running parallel to each other for propagating signals in opposite directions. Each waveguide junction has associated reflection and propagation coefficients. A parameter library stores sets of glottal source and waveguide junction control parameters for generating corresponding sets of predefined speech signals. The waveguide junction control parameters cause the digital waveguide network to simulate operation of an acoustic tube with a shape corresponding to that of a human pharynx while producing predefined speech sounds. An articulation controller operates the glottal signal source and the digital waveguide network using a sequence of selected sets of said control parameters, thereby causing the synthesizer to generate a specified sequence of speech signals. In a preferred embodiment, the digital waveguide network has three interconnected network branches for simulating operation of the lower pharynx, the oropharynx, and the nasopharynx. To generate speech signals corresponding to fricative consonants, the speech synthesizer has noise signal injectors positioned at various points along the digital waveguide network.
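The series-connected waveguide sections described above can be illustrated with a minimal Python sketch. It is not the synthesizer's actual implementation. The abstract does not give the junction equations, so the sketch assumes the standard Kelly-Lochbaum scattering form for pressure waves, with each reflection coefficient derived from the cross-sectional areas of adjacent tube sections. The function names, the one-sample delay per section, and the constant reflection values standing in for the glottal termination and the lip filter are all hypothetical simplifications.

```python
def reflection_coefficients(areas):
    """Assumed Kelly-Lochbaum form: r = (A_i - A_{i+1}) / (A_i + A_{i+1})."""
    return [(a0 - a1) / (a0 + a1) for a0, a1 in zip(areas, areas[1:])]


def scatter(r, right_in, left_in):
    """Scatter the two waves arriving at one junction (pressure-wave convention)."""
    right_out = (1.0 + r) * right_in - r * left_in  # continues toward the lips
    left_out = r * right_in + (1.0 - r) * left_in   # returns toward the glottis
    return right_out, left_out


def simulate_tube(areas, source, glottis_reflection=0.9, lip_reflection=-0.9):
    """Run a glottal source through a single branch of tube sections.

    Each section holds one sample per direction (a one-sample delay line pair).
    The constant end reflections are placeholders, not the patent's lip filter.
    """
    n = len(areas)
    ks = reflection_coefficients(areas)
    right = [0.0] * n  # right-going samples, one per section
    left = [0.0] * n   # left-going samples, one per section
    output = []
    for x in source:
        new_right = [0.0] * n
        new_left = [0.0] * n
        # Glottal end: inject the source plus a partial reflection.
        new_right[0] = x + glottis_reflection * left[0]
        # Interior junctions between section j and section j + 1.
        for j, r in enumerate(ks):
            new_right[j + 1], new_left[j] = scatter(r, right[j], left[j + 1])
        # Lip end: part of the wave reflects back, the rest radiates as output.
        new_left[n - 1] = lip_reflection * right[n - 1]
        output.append((1.0 + lip_reflection) * right[n - 1])
        right, left = new_right, new_left
    return output


if __name__ == "__main__":
    # Hypothetical area profile (arbitrary units) and a unit impulse as the source.
    areas = [1.0, 2.5, 4.0, 1.5]
    impulse = [1.0] + [0.0] * 15
    print(simulate_tube(areas, impulse))
```

In this sketch, changing the `areas` list plays the role of selecting a different set of junction control parameters: each tube shape yields a different set of reflection coefficients and therefore a different output signal. A fuller model would add the three-branch junction and the noise injectors mentioned in the abstract.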
