Abstract
Speaker adaptation techniques can be classified as intra-lingual or cross-lingual, depending on whether the source model and the target speaker use the same language. Most work in this field has focused on the intra-lingual case, while the cross-lingual case has been less explored. In this paper we address the cross-lingual paradigm in the framework of an HMM-based speech synthesis system by further developing a previously proposed approach. This method clones a given speaker into a different language by combining the linguistic structure and the acoustic characteristics of two HTS models. We extend the adaptation procedure to other source model parameters that were left unmodified in the initial version, and compare the performance of both versions by means of subjective and objective tests. These results are also contrasted with those obtained by a KLD-based technique proposed in the literature for a similar purpose. While no significant preference for either version of our method is observed, our approach clearly outperforms the KLD-based technique.