Abstract
This paper aims at developing an HMM-based speech synthesis system capable of generating creaky voice in addition to modal voice. Generation of creaky voice is carried out by addressing two main issues, namely, an automatic prediction of creaky voice and appropriate modelling of the excitation signal of creaky voice. An automatic creaky voice detection method is proposed based on the analysis of variation of epoch parameters for different voicing regions. A neural network classifier is trained using the variances of epoch parameters for detection of creaky regions. A hybrid source model which is an extension of recently developed time-domain deterministic plus noise model is proposed for modelling creaky excitation signal. In the proposed hybrid source model, the pitch-synchronous analysis is performed on the creaky excitation signal of every phone. From the creaky residual frames of every phonetic class, the deterministic and noise components are estimated. The creaky deterministic components of all phonetic classes are stored in the database. The noise components are parameterized in terms of spectral and amplitude envelopes and are modelled by HMMs. During synthesis, the appropriate deterministic component is selected from the database, and the noise component is constructed from the parameters generated from HMMs. The creaky deterministic and noise components are pitch-synchronously overlap-added to produce the creaky excitation signal. Subjective evaluation results indicate that the incorporation of creaky voice has improved the naturalness of the synthetic speech of two male speakers, and the quality is slightly better than the basic time-domain deterministic plus noise model meant for only modal excitation.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.