
This paper proposes a voice conversion approach that combines two distinct ideas to improve converted-voice quality. The first idea is to map spectral features, e.g. discrete cepstrum coefficients (DCC), with segmental Gaussian mixture models (GMMs): a single GMM with a large number of mixture components is replaced by several voice-content-specific GMMs, each with far fewer mixture components. The second idea is to select, from the target-speaker frame pool of the segment class to which the input frame belongs, a frame whose spectral features are near the mapped feature vector. Both ideas are intended to alleviate a problem of traditional GMM-based conversion methods, namely that converted spectral envelopes are usually over-smoothed. To apply the first idea in an on-line voice conversion system, we propose an automatic GMM selection algorithm based on dynamic programming (DP). Furthermore, as previous researchers have pointed out, mapping with a single selected Gaussian probability density function (PDF), instead of a combination of several Gaussian PDFs, helps to obtain better converted-voice quality. We therefore also propose a Gaussian PDF selection algorithm and integrate it into our system. To implement the second idea, we adopt a DP-based algorithm that considers both frame-matching and connecting distances. To evaluate the two ideas, three voice conversion systems are constructed and used in listening tests. The test results show that the system combining the two ideas obtains much improved voice quality as well as improved timbre similarity.
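
The DP-based frame selection described for the second idea can be illustrated with a minimal Viterbi-style sketch. It is not the authors' implementation: the abstract does not define the two distances, so this example assumes Euclidean distance for both the matching cost (mapped vector to pool frame) and the connecting cost (between consecutively selected pool frames). The function name `select_frames` and the weight `w` are hypothetical.

```python
import numpy as np

def select_frames(mapped, pool, w=1.0):
    """Pick one pool frame per mapped vector by dynamic programming.

    mapped: (T, D) mapped feature vectors (e.g. GMM-converted DCC frames).
    pool:   (N, D) target-speaker frames of one segment class.
    w:      hypothetical weight on the connecting distance.
    Returns a list of T pool indices minimizing the summed cost.
    """
    mapped = np.asarray(mapped, dtype=float)
    pool = np.asarray(pool, dtype=float)
    T, N = len(mapped), len(pool)

    # match[t, j]: assumed matching distance (Euclidean) between
    # mapped vector t and pool frame j.
    match = np.linalg.norm(mapped[:, None, :] - pool[None, :, :], axis=2)
    # connect[i, j]: assumed connecting distance (Euclidean) between
    # pool frames i and j when selected for consecutive input frames.
    connect = np.linalg.norm(pool[:, None, :] - pool[None, :, :], axis=2)

    cost = match[0].copy()              # best cost ending at each pool frame
    back = np.zeros((T, N), dtype=int)  # back-pointers for path recovery
    for t in range(1, T):
        total = cost[:, None] + w * connect   # total[i, j]: move from i to j
        back[t] = np.argmin(total, axis=0)
        cost = total[back[t], np.arange(N)] + match[t]

    # Trace the cheapest path backwards from the last frame.
    path = [int(np.argmin(cost))]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]
```

With `w=0` the search reduces to picking the nearest pool frame for each input frame independently; a larger `w` favors sequences of selected frames that lie close to one another, at the cost of a looser match to the mapped vectors.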
