Abstract

Speech disentanglement aims to decompose independent causal factors of speech signals into separate codes. Perfect disentanglement benefits to a broad range of speech processing tasks. This paper presents a simple but effective disentanglement approach based on cycle consistency loss and random factor substitution. This leads to a novel random cycle (RC) loss that enforces analysis-and-resynthesis consistency, a main principle of reductionism. We theoretically demonstrate that the proposed RC loss can achieve independent codes if well optimized, which in turn leads to superior disentanglement when combined with information bottleneck (IB). Extensive simulation experiments were conducted to understand the properties of the RC loss, and experimental results on voice conversion further demonstrate the practical merit of the proposal. Source code and audio samples can be found on the webpage http://rc.cslt.org.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call