Abstract

A statistical trajectory speech model is constructed in which the targets for vocal tract resonances are represented as random vectors, and the mean vectors of the target distributions are estimated using a likelihood function for joint acoustic observation vectors. The target mean vectors can therefore be estimated without formant data. To form the model, time-dependent filter parameter vectors are constructed from time-dependent coarticulation parameters. These filter parameter vectors are a function of the ordering and identity of the phones in the phone sequence of each speech utterance, of the temporal extent of coarticulation, and of the speaker's speaking effort.
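To make the role of the filter parameter vectors concrete, the following is a minimal sketch and not the method described in the full text. It assumes a simple bidirectional, normalized exponential filter applied to a per-frame sequence of phone targets. The names `filter_weights`, `trajectory`, `gamma` (a hypothetical per-phone coarticulation parameter), and `D` (a hypothetical window half-width standing in for the temporal extent of coarticulation) are illustrative assumptions. The weights for each frame depend on the identity and ordering of the phones inside the window, which is why they are time-dependent.

```python
import numpy as np

def filter_weights(phone_ids, gamma, k, D):
    """Normalized filter weights for frame k over the window [k-D, k+D].

    Assumption: the weight of frame t decays as gamma[phone]**|k - t|,
    where gamma is a hypothetical per-phone coarticulation parameter.
    """
    K = len(phone_ids)
    taus = np.arange(max(0, k - D), min(K, k + D + 1))
    w = np.array([gamma[phone_ids[t]] ** abs(k - t) for t in taus])
    return taus, w / w.sum()  # weights sum to 1 for every frame

def trajectory(phone_ids, target_means, gamma, D):
    """Smooth a stepwise sequence of target means into a trajectory.

    phone_ids:    per-frame phone index (list of ints)
    target_means: array of shape (num_phones, dim), one target mean per phone
    """
    K = len(phone_ids)
    z = np.zeros((K, target_means.shape[1]))
    for k in range(K):
        taus, w = filter_weights(phone_ids, gamma, k, D)
        targets = target_means[[phone_ids[t] for t in taus]]
        z[k] = w @ targets  # convex combination of neighbouring targets
    return z

# Illustrative values only: two phones, two resonance frequencies each (Hz).
phone_ids = [0] * 5 + [1] * 5
target_means = np.array([[500.0, 1500.0], [700.0, 1100.0]])
gamma = np.array([0.6, 0.6])
z = trajectory(phone_ids, target_means, gamma, D=3)
```

Because the weights are normalized, each trajectory point is a weighted average of the targets in its window: frames far from a phone boundary sit on their own target, while frames near the boundary are pulled toward the neighbouring phone.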
