Parallel data free singing voice conversion with cycle-consistent BEGAN

Assila Yousuf, David Solomon George

doi:10.1016/j.matpr.2022.01.169

Abstract

Singing voice conversion (SVC) modifies the timbre of a source singer to match that of a target singer while retaining the linguistic content. Recent studies focus mainly on non-parallel training data, since parallel training data is difficult to obtain in real-life applications. In this paper, a parallel-data-free SVC technique is proposed using Cycle-consistent Boundary Equilibrium Generative Adversarial Networks (CycleBEGAN) with gated convolutional neural networks (CNNs) and an identity-mapping loss. CycleBEGAN learns the data distribution using both an adversarial loss and a cycle-consistency loss. Together, the gated CNNs and the identity-mapping loss capture sequential and hierarchical structure and preserve linguistic information. The technique produces high-quality converted singing voice without any time-alignment procedure and requires only a small amount of training data.
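
The building blocks named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the L1 distance, the toy shift "generators", and the default values of `gamma` and `lambda_k` are all assumptions chosen for illustration, following the generic CycleGAN and BEGAN formulations rather than anything stated in this abstract.

```python
import numpy as np


def glu(a, b):
    """Gated linear unit: linear path `a` scaled by a sigmoid gate on `b`."""
    return a * (1.0 / (1.0 + np.exp(-b)))


def l1(x, y):
    """Mean absolute difference between two feature arrays."""
    return float(np.mean(np.abs(x - y)))


def cycle_consistency_loss(x, y, g_xy, g_yx):
    """Penalise X -> Y -> X and Y -> X -> Y round trips that fail to return the input."""
    return l1(g_yx(g_xy(x)), x) + l1(g_xy(g_yx(y)), y)


def identity_mapping_loss(x, y, g_xy, g_yx):
    """Penalise a generator for changing input that is already in its target domain."""
    return l1(g_xy(y), y) + l1(g_yx(x), x)


def began_k_update(k, loss_real, loss_fake, gamma=0.5, lambda_k=0.001):
    """BEGAN-style balance update (assumed generic form), clipped to [0, 1]."""
    k_next = k + lambda_k * (gamma * loss_real - loss_fake)
    return float(np.clip(k_next, 0.0, 1.0))


# Hypothetical toy "generators": constant shifts standing in for neural networks.
g_xy = lambda f: f + 1.0  # source singer -> target singer
g_yx = lambda f: f - 1.0  # target singer -> source singer

x = np.zeros((4, 3))  # stand-in for source-singer acoustic features
y = np.ones((4, 3))   # stand-in for target-singer acoustic features

cycle = cycle_consistency_loss(x, y, g_xy, g_yx)
identity = identity_mapping_loss(x, y, g_xy, g_yx)
```

With these toy shifts the two mappings are exact inverses, so the cycle-consistency term vanishes, while the identity-mapping term stays positive because each mapping still alters input that already lies in its target domain. That is the behaviour the identity-mapping loss is meant to discourage.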

Full Text