Statistical Models for Dealing with Discontinuity of Fundamental Frequency

Kai Yu

doi:10.1007/978-3-662-45258-5_9

Abstract

The accurate modelling of fundamental frequency, or F0, in HMM-based speech synthesis is a critical factor for achieving high-quality speech. However, it is also difficult because F0 values are normally considered to depend on a binary voicing decision, such that they are continuous in voiced regions and undefined in unvoiced regions. That is, the estimated F0 value is a discontinuous function of time, whose domain is partly continuous and partly discrete. This chapter investigates two statistical frameworks for dealing with the discontinuity of F0. Discontinuous F0 modelling strictly defines the probability of a random variable with a discontinuous domain and models it directly. A widely used approach within this framework is the multi-space probability distribution (MSD). An alternative framework is continuous F0 modelling, where continuous F0 observations are assumed to always exist and voicing classification is modelled separately. Both theoretical and experimental comparisons of the two frameworks are given.
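
The contrast between the two frameworks can be made concrete with a minimal sketch. It is an illustration under stated assumptions, not the chapter's implementation: the function names, the use of `None` as the unvoiced marker, the single-Gaussian voiced space, and the choice of linear interpolation to fill unvoiced frames are all hypothetical. The first function scores one frame under a two-space MSD state, where the unvoiced space is zero-dimensional and contributes only its weight. The second converts the same track into a continuous F0 stream plus a separate voicing stream.

```python
import math

UNVOICED = None  # hypothetical marker for frames with no defined F0


def msd_log_likelihood(obs, voiced_weight, mean, var):
    """Log-likelihood of one frame under a two-space MSD state.

    The voiced space is a 1-D Gaussian over (log-)F0. The unvoiced space is
    zero-dimensional, so it contributes only its space weight.
    """
    if obs is UNVOICED:
        return math.log(1.0 - voiced_weight)
    gauss = -0.5 * (math.log(2.0 * math.pi * var) + (obs - mean) ** 2 / var)
    return math.log(voiced_weight) + gauss


def to_continuous(f0_track):
    """Return (continuous_f0, voicing_flags) as two separate streams.

    Unvoiced frames are filled by linear interpolation between the nearest
    voiced neighbours (an assumption made for this sketch); leading and
    trailing unvoiced frames copy the nearest voiced value.
    """
    voiced_idx = [i for i, v in enumerate(f0_track) if v is not UNVOICED]
    if not voiced_idx:
        raise ValueError("track has no voiced frames to interpolate from")
    flags = [v is not UNVOICED for v in f0_track]
    filled = []
    for i, v in enumerate(f0_track):
        if v is not UNVOICED:
            filled.append(v)
            continue
        prev = max((j for j in voiced_idx if j < i), default=None)
        nxt = min((j for j in voiced_idx if j > i), default=None)
        if prev is None:
            filled.append(f0_track[nxt])
        elif nxt is None:
            filled.append(f0_track[prev])
        else:
            t = (i - prev) / (nxt - prev)
            filled.append(f0_track[prev] + t * (f0_track[nxt] - f0_track[prev]))
    return filled, flags


if __name__ == "__main__":
    track = [UNVOICED, 5.0, UNVOICED, UNVOICED, 5.3, UNVOICED]  # toy log-F0 values
    print([msd_log_likelihood(o, 0.8, 5.1, 0.04) for o in track])
    print(to_continuous(track))
```

In the MSD view each frame belongs to exactly one space, so the voicing decision and the F0 value are scored by the same distribution. In the continuous view every frame has an F0 value, and the voicing flags would be modelled by a separate stream.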
