Optimizing Voice Conversion Network with Cycle Consistency Loss of Speaker Identity

Hongqiang Du,Lei Xie,Xiaohai Tian,Haizhou Li

doi:10.1109/slt48900.2021.9383567

Open Access

https://doi.org/10.1109/slt48900.2021.9383567

Publication Date: Jan 19, 2021

Citations: 44

Affiliation: National University of Singapore, Northwestern Polytechnical University

#Speaker Identity #Proposed Training Scheme

Abstract

We propose a novel training scheme that optimizes a voice conversion network with a speaker identity loss function. The training scheme minimizes not only the frame-level spectral loss but also a speaker identity loss. We introduce a cycle consistency loss that constrains the converted speech to maintain the same speaker identity as the reference speech at the utterance level. While the proposed training scheme is applicable to any voice conversion network, we formulate the study under the average model voice conversion framework in this paper. Experiments conducted on the CMU-ARCTIC and CSTR-VCTK corpora confirm that the proposed method outperforms baseline methods in terms of speaker similarity.
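
The combined objective described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything below is an assumption made for illustration: the spectral loss is taken to be a mean squared error over frames, the speaker identity loss is taken to be one minus the cosine similarity between utterance-level embeddings, the speaker encoder is replaced by a toy mean-pooling function (`toy_speaker_embedding`, hypothetical), and the two terms are combined with a hypothetical `weight` parameter.

```python
import math


def frame_level_spectral_loss(predicted, target):
    """Mean squared error over all frames and feature dimensions (assumed form)."""
    total, count = 0.0, 0
    for p_frame, t_frame in zip(predicted, target):
        for p, t in zip(p_frame, t_frame):
            total += (p - t) ** 2
            count += 1
    return total / count


def toy_speaker_embedding(frames):
    """Hypothetical stand-in for a speaker encoder: mean-pool frames over time.

    A real system would use a trained speaker embedding network here.
    """
    dim = len(frames[0])
    return [sum(frame[d] for frame in frames) / len(frames) for d in range(dim)]


def speaker_identity_loss(converted, reference, embed=toy_speaker_embedding):
    """1 - cosine similarity between utterance-level embeddings (assumed distance)."""
    a, b = embed(converted), embed(reference)
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        return 1.0  # treat a zero embedding as maximally dissimilar
    return 1.0 - dot / norm


def total_loss(converted, target, reference, weight=1.0):
    """Frame-level spectral term plus a weighted utterance-level identity term."""
    spectral = frame_level_spectral_loss(converted, target)
    identity = speaker_identity_loss(converted, reference)
    return spectral + weight * identity
```

Note that the identity term compares whole utterances, so in this sketch the reference speech does not need to have the same number of frames as the converted speech, whereas the spectral term compares frames one by one.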
