Abstract

This paper proposes a generative adversarial training method for deep neural network (DNN)-based singing voice synthesis. The DNN-based approach has been used in statistical parametric singing voice synthesis and improved the naturalness of the synthesized singing voice [1]. Recently, generative adversarial networks (GANs) [2] have attracted significant attention in various machine learning research areas including speech synthesis [3]. GANs have achieved great success in modeling the distributions of complex data, and they have the potential to alleviate over-smoothing problem on the generated speech parameters in speech synthesis. In this paper, we propose a DNN-based singing voice synthesis system incorporating the GAN. Experimental results show that the proposed method outperforms the conventional method in the naturalness of the synthesized singing voice.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call