Abstract

Beyond conventional voice conversion (VC), in which speaker identity is converted without altering the linguistic content, there are real-world scenarios where the background sounds are informative and need to be retained, such as VC for movies/videos and VC for music, where the voice is entangled with background sounds. As a new VC framework, we previously developed a noisy-to-noisy (N2N) VC framework that converts the speaker's identity while preserving the background sounds. Although this framework, consisting of a denoising module and a VC module, handles background sounds well, the VC module is sensitive to the distortion introduced by the denoising module. To address this distortion issue, in this paper we propose an improved VC module that directly models the noisy speech waveform while controlling the background sounds. Experimental results demonstrate that our improved framework significantly outperforms the previous one and achieves an acceptable naturalness score, while reaching similarity performance comparable to the upper bound of our framework.

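To make the two-module structure described above concrete, the following is a minimal sketch of how such a noisy-to-noisy pipeline could be organized: a denoising module separates the voice from the background, a VC module converts the cleaned voice, and the background is mixed back into the output. The function names (`denoise`, `convert_speaker`) are hypothetical placeholders, not the authors' actual API, and the abstract's improved framework instead models the noisy waveform directly rather than relying on this cascade.

```python
import numpy as np


def n2n_voice_conversion(noisy_speech: np.ndarray,
                         target_speaker_id: int,
                         denoise,
                         convert_speaker) -> np.ndarray:
    """Sketch: convert speaker identity while preserving background sounds.

    `denoise` and `convert_speaker` are assumed callables standing in for
    the denoising module and the VC module, respectively.
    """
    # 1. Separate the voice from the background sounds.
    clean_speech = denoise(noisy_speech)
    background = noisy_speech - clean_speech  # residual approximates the background

    # 2. Convert the speaker identity of the (possibly distorted) clean voice.
    #    Distortion introduced here by the denoiser is the issue the paper targets.
    converted_speech = convert_speaker(clean_speech, target_speaker_id)

    # 3. Re-mix the background so it is retained in the converted output.
    return converted_speech + background
```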