Abstract

This paper proposes a novel algorithm for voice conversion. The mapping function between the spectral vectors of the source and target speakers is estimated by Canonical Correlation Analysis (CCA) based on Gaussian mixture models. Because the spectral envelope feature retains most of the second-order statistical information contained in speech after Linear Prediction Coding (LPC) analysis, CCA is better suited to spectral conversion than Minimum Mean Square Error (MMSE) estimation: CCA explicitly considers the variance of each component of the spectral vectors during the conversion procedure. Both objective evaluations and subjective listening tests are conducted. The experimental results demonstrate that the proposed scheme performs better than the previous method, which uses the MMSE estimation criterion.
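For readers unfamiliar with CCA, the following is a minimal sketch of plain CCA on paired feature vectors, written with NumPy. It is not the estimator proposed in the paper: it omits the Gaussian mixture model and the LPC front end, and the names `inv_sqrt`, `cca`, `src`, `tgt`, and the regularizer `reg` are hypothetical choices made for illustration. The synthetic data merely stands in for time-aligned source and target spectral vectors.

```python
import numpy as np

def inv_sqrt(c):
    # Inverse square root of a symmetric positive-definite matrix.
    w, v = np.linalg.eigh(c)
    return v @ np.diag(1.0 / np.sqrt(w)) @ v.T

def cca(x, y, reg=1e-8):
    # x, y: (n_frames, dim) arrays of paired vectors (assumed already aligned).
    x = x - x.mean(axis=0)
    y = y - y.mean(axis=0)
    n = x.shape[0]
    # Second-order statistics; reg is a small ridge term for numerical stability.
    cxx = x.T @ x / (n - 1) + reg * np.eye(x.shape[1])
    cyy = y.T @ y / (n - 1) + reg * np.eye(y.shape[1])
    cxy = x.T @ y / (n - 1)
    kx, ky = inv_sqrt(cxx), inv_sqrt(cyy)
    # The singular values of the whitened cross-covariance are the canonical correlations.
    u, s, vt = np.linalg.svd(kx @ cxy @ ky)
    a = kx @ u      # projection directions for x (one per column)
    b = ky @ vt.T   # projection directions for y (one per column)
    return a, b, s

# Synthetic stand-in data: the target is a noisy linear function of the source.
rng = np.random.default_rng(0)
src = rng.normal(size=(500, 4))
tgt = src @ rng.normal(size=(4, 4)) + 0.1 * rng.normal(size=(500, 4))
a, b, corr = cca(src, tgt)
print(corr)  # canonical correlations, largest first
```

The projected variates `src @ a` and `tgt @ b` each have (approximately) unit variance in every component, which illustrates the sense in which CCA normalizes per-component variance rather than only minimizing squared error.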
