Abstract
Most voice conversion (VC) research has used parallel training corpora to train the conversion function. In practice, however, it is not always possible to gather parallel corpora, so non-parallel training methods are needed. One successful non-parallel method, the iterative combination of a nearest neighbour search step and a conversion step alignment method (INCA), has attracted a lot of attention in recent years. In this study, the authors propose a new non-parallel VC method based on the INCA algorithm. The authors' method effectively solves the initialisation problem of the INCA algorithm: INCA is initialised by aligning Gaussian mixture models (GMMs) using a universal background model. Objective and subjective experiments show that the authors' proposed method improves on the INCA algorithm, and that this superiority holds for training sets ranging from 10 to 50 sentences. In terms of mean opinion score, the authors' method scores 0.25 higher for quality and 0.2 higher for similarity to the target speaker than traditional INCA. These results suggest that the authors' proposed method is a suitable frame alignment method for non-parallel corpora in the VC task.
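As a rough illustration of the kind of loop INCA performs, the following sketch alternates a nearest neighbour search step with a conversion step on two sets of feature frames. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names (`nearest_indices`, `fit_affine`, `apply_affine`, `inca_align`) are hypothetical, a least-squares affine map stands in for the conversion function, and the loop starts from the unconverted source frames, which is assumed here to be the baseline initialisation. The GMM alignment with a universal background model that the authors propose for initialisation is not reproduced.

```python
import numpy as np


def nearest_indices(a, b):
    """For each row of a, return the index of the closest row of b (Euclidean)."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


def fit_affine(src, tgt):
    """Least-squares affine map from src to tgt.

    Assumption: this is only a stand-in for the conversion function.
    """
    src_h = np.hstack([src, np.ones((len(src), 1))])
    w, *_ = np.linalg.lstsq(src_h, tgt, rcond=None)
    return w


def apply_affine(w, x):
    """Apply the affine map w (shape (d + 1, d)) to the frames in x."""
    return np.hstack([x, np.ones((len(x), 1))]) @ w


def inca_align(x, y, n_iter=10):
    """Alternate nearest neighbour pairing and conversion for n_iter rounds.

    x: source frames, shape (N, d); y: target frames, shape (M, d).
    Returns the final affine map and, for each source frame, the index
    of the target frame it was paired with in the last round.
    """
    x_conv = x.copy()  # assumed baseline: start from unconverted source frames
    for _ in range(n_iter):
        # Nearest neighbour search step, in both directions.
        x_to_y = nearest_indices(x_conv, y)
        y_to_x = nearest_indices(y, x_conv)
        # Pair the original source frames with their matched target frames.
        src = np.vstack([x, x[y_to_x]])
        tgt = np.vstack([y[x_to_y], y])
        # Conversion step: refit the map, then convert the source frames again.
        w = fit_affine(src, tgt)
        x_conv = apply_affine(w, x)
    return w, x_to_y
```

Calling `inca_align(x, y)` with two NumPy arrays of frames that share the same feature dimension returns the fitted map and the final source-to-target pairing. Because the first pairing depends entirely on the starting frames, a poor start can lock the loop into a poor alignment, which is the initialisation problem the abstract refers to.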