Abstract
A large parallel training corpus is usually necessary for robust, high-quality voice conversion, but such parallel data may not always be available. This letter presents a new voice conversion method that requires no parallel speech corpus. A restricted Boltzmann machine (RBM) is adopted to represent the distribution of the spectral features of a target speaker, and a linear transformation is employed to convert the source spectral and delta features. The conversion function is obtained by maximizing the conditional probability density function with respect to the target RBM. A feasibility test was carried out on the OGI VOICES corpus, and both subjective listening tests and objective evaluations showed that the proposed method outperforms the conventional GMM-based method.
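As a rough formalization of the idea described above (not the letter's own notation, but a minimal sketch under assumed symbols): let $x_t$ denote the source spectral-plus-delta feature vector at frame $t$, let $A$ and $b$ parameterize the linear conversion, let $p_{\mathrm{RBM}}$ be the density modeled by the RBM trained on target-speaker features, and let $F(\cdot)$ be its free energy.

\begin{align}
  \hat{y}_t &= A\,x_t + b,\\
  (\hat{A},\hat{b}) &= \operatorname*{arg\,max}_{A,b} \sum_t \log p_{\mathrm{RBM}}(A\,x_t + b)
  \;=\; \operatorname*{arg\,min}_{A,b} \sum_t F(A\,x_t + b),
\end{align}

where the second equality holds because the RBM partition function does not depend on $A$ or $b$. All symbols here are illustrative assumptions intended only to indicate how a conversion function could be fit by maximizing the target-model likelihood of the transformed features.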