Abstract

Voice conversion (VC) modifies the speech signal produced by a source speaker so that it sounds as if it were spoken by a target speaker. In this paper we compare two techniques for voice conversion. The first technique uses a conversion function based on a Gaussian mixture model (GMM) to transform the spectral envelope described by line spectral frequency (LSF) parameters and the linear predictive coding (LPC) residuals or Mel-frequency cepstral coefficient (MFCC) parameters. The second technique uses Pitch Synchronous Overlap Add (PSOLA) and resampling. The two techniques are compared using subjective evaluation as well as objective measures such as mean squared error (MSE) and pitch estimation.
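As an illustration of the objective measure mentioned above, the following is a minimal sketch of an MSE computation between converted and target feature frames. It assumes the two sequences are already time-aligned and hold the same number of coefficients per frame (for example LSF or MFCC vectors). The function name `mean_squared_error` and the sample values are hypothetical and do not come from the paper, which does not specify how its MSE is computed.

```python
from typing import Sequence


def mean_squared_error(
    converted: Sequence[Sequence[float]],
    target: Sequence[Sequence[float]],
) -> float:
    """Average squared difference over all frames and coefficients.

    Assumption: both sequences are time-aligned, frame by frame.
    """
    if len(converted) != len(target):
        raise ValueError("sequences must have the same number of frames")

    total = 0.0
    count = 0
    for conv_frame, tgt_frame in zip(converted, target):
        if len(conv_frame) != len(tgt_frame):
            raise ValueError("frames must have the same number of coefficients")
        for c, t in zip(conv_frame, tgt_frame):
            total += (c - t) ** 2
            count += 1

    if count == 0:
        raise ValueError("no coefficients to compare")
    return total / count


# Hypothetical two-frame example with two coefficients per frame.
converted_frames = [[1.0, 2.0], [3.0, 4.0]]
target_frames = [[1.0, 2.0], [3.0, 6.0]]
error = mean_squared_error(converted_frames, target_frames)
```

A lower value means the converted features lie closer to the target speaker's features. In practice the frames would first need to be aligned in time, which this sketch leaves out.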
