Abstract

This paper presents an alternative voice conversion technique that uses a support vector machine (SVM) as a regression tool to convert the voice of a source speaker to that of a specific target speaker. The method captures a nonlinear mapping function between the acoustic feature parameters of the two speakers. The vocal tract characteristics are represented by line spectral frequencies (LSFs), and the mapping is learned in the kernel-induced feature space of a radial basis function network type SVM with a Gaussian basis function. A codebook-based technique modifies the intonation characteristics (pitch contour): the pitch contour is mapped at the word level by associating codebooks derived from the pitch contours of the source and target speakers. Speech for the desired target speaker is synthesized from the transformed LSFs together with the modified pitch contour, and is evaluated using both subjective evaluation and listening tests. The results indicate that the proposed model improves voice conversion performance in terms of capturing the target speaker's identity. Performance could be improved further by tuning the user-defined parameters of the regression analysis and by using more LSF vectors in the training stage.
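
The word-level codebook association described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that each codeword is a word-level pitch contour resampled to a fixed length, that source and target codebooks are paired entry by entry, and that the nearest codeword is chosen by Euclidean distance. The function names and the toy F0 values are hypothetical.

```python
from math import dist


def nearest_index(codebook, contour):
    """Return the index of the codeword closest (Euclidean) to `contour`."""
    return min(range(len(codebook)), key=lambda i: dist(codebook[i], contour))


def map_pitch_contour(source_contour, source_codebook, target_codebook):
    """Map a word-level source pitch contour to its paired target codeword.

    Assumption: codeword i of the source codebook corresponds to
    codeword i of the target codebook.
    """
    if len(source_codebook) != len(target_codebook):
        raise ValueError("codebooks must be paired entry by entry")
    return target_codebook[nearest_index(source_codebook, source_contour)]


# Hypothetical toy codebooks: each codeword is a 3-point F0 contour in Hz.
source_codebook = [[110.0, 120.0, 115.0], [130.0, 125.0, 118.0]]
target_codebook = [[200.0, 215.0, 208.0], [230.0, 222.0, 210.0]]

# The input contour is closest to the second source codeword,
# so the second target codeword is returned.
mapped = map_pitch_contour([128.0, 126.0, 119.0], source_codebook, target_codebook)
print(mapped)
```

In a full system the codebooks would be learned from training data, for example by clustering word-level contours; the abstract does not specify how, so that step is left out here.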
