Abstract

In traditional vocal tract length normalization (VTLN), the detailed spectral differences among speakers cannot be modeled, because a single vocal tract length factor is treated as the sole indicator of speaker-specific characteristics, following the assumption of a lossless multi-tube vocal tract model. In this paper, a piecewise frequency warping function is adopted to describe speaker-specific characteristics in more detail. With an appropriate partition of the frequency axis, these spectral differences can be largely removed. Because the warping function is model-independent, the method proves to be a fast adaptation technique that is especially suitable for unsupervised adaptation.
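As a rough illustration of the idea, a piecewise frequency warping function can be sketched as a continuous piecewise-linear map with one warping factor per frequency band. The abstract does not give the paper's partition or how the factors are estimated, so the function name `piecewise_warp`, the band edges, and the factor values below are hypothetical and chosen only for illustration. Traditional single-factor VTLN corresponds to the special case of one band.

```python
from bisect import bisect_right


def piecewise_warp(freq, edges, factors):
    """Map a frequency through a continuous piecewise-linear warping function.

    edges   -- ascending band boundaries in Hz, e.g. [0, 1000, 3000, 8000]
    factors -- one warping factor (slope) per band, len(edges) - 1 values

    Both arguments are illustrative assumptions, not values from the paper.
    """
    if len(factors) != len(edges) - 1:
        raise ValueError("need exactly one factor per band")
    if not edges[0] <= freq <= edges[-1]:
        raise ValueError("frequency lies outside the partitioned axis")

    # Index of the band containing freq (the top edge belongs to the last band).
    band = min(bisect_right(edges, freq) - 1, len(factors) - 1)

    # Warped position of the band's lower edge: sum of the warped widths
    # of all lower bands, which keeps the function continuous at the edges.
    warped = edges[0]
    for i in range(band):
        warped += factors[i] * (edges[i + 1] - edges[i])

    # Linear warp inside the band itself.
    return warped + factors[band] * (freq - edges[band])


# Hypothetical three-band partition with a different factor in each band.
EDGES = [0.0, 1000.0, 3000.0, 8000.0]
FACTORS = [1.0, 1.1, 0.9]
warped_2k = piecewise_warp(2000.0, EDGES, FACTORS)
```

In this sketch the upper band edge is not forced to map onto itself; a practical front end would typically constrain or rescale the last segment so that the warped axis still covers the analysis bandwidth.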
