Again-VC: A One-Shot Voice Conversion Using Activation Guidance and Adaptive Instance Normalization

Yen-Hao Chen, Da-Yi Wu, Tsung-Han Wu, Hung-Yi Lee

doi:10.1109/icassp39728.2021.9414257

Abstract

Voice conversion (VC) has been widely studied in recent years. Many VC systems use disentanglement-based learning to separate the speaker information from the linguistic content of a speech signal, then convert the voice by replacing the speaker information with that of the target speaker. To prevent speaker information from leaking into the content embeddings, previous works either reduce the dimension of the content embedding or quantize it, which acts as a strong information bottleneck. These mechanisms tend to hurt the synthesis quality. In this work, we propose AGAIN-VC, a VC system using Activation Guidance and Adaptive Instance Normalization. AGAIN-VC is an auto-encoder-based model comprising a single encoder and a decoder. With a proper activation as an information bottleneck on the content embeddings, the trade-off between the synthesis quality and the speaker similarity of the converted speech is improved drastically. This one-shot VC system obtains the best performance in both subjective and objective evaluations.
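The two ingredients named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify the activation or the layer layout, so the sigmoid, the helper names (`channel_stats`, `instance_norm`, `sigmoid_guidance`, `adain`), and the toy feature values are all assumptions made for illustration. The sketch normalizes each channel over time, squashes the result with an activation to limit what the content features can carry, and then applies the target's per-channel statistics in the style of adaptive instance normalization.

```python
import math
from statistics import fmean, pstdev

EPS = 1e-5  # avoids division by zero for constant channels


def channel_stats(features):
    """Per-channel mean and std over time; features is a list of channels."""
    means = [fmean(ch) for ch in features]
    stds = [pstdev(ch) + EPS for ch in features]
    return means, stds


def instance_norm(features):
    """Normalize each channel to zero mean and (near) unit std over time."""
    means, stds = channel_stats(features)
    return [[(x - m) / s for x in ch] for ch, m, s in zip(features, means, stds)]


def sigmoid_guidance(features):
    """Assumed activation bottleneck: squash content values into (0, 1)."""
    return [[1.0 / (1.0 + math.exp(-x)) for x in ch] for ch in features]


def adain(content, target_means, target_stds):
    """Re-normalize content, then apply the target's per-channel statistics."""
    normed = instance_norm(content)
    return [
        [x * s + m for x in ch]
        for ch, m, s in zip(normed, target_means, target_stds)
    ]


# Toy 2-channel "features" over 4 time steps (hypothetical values).
source = [[1.0, 2.0, 3.0, 4.0], [10.0, 10.5, 9.5, 10.0]]
target = [[-2.0, 0.0, 2.0, 0.0], [5.0, 7.0, 5.0, 7.0]]

content = sigmoid_guidance(instance_norm(source))
t_means, t_stds = channel_stats(target)
converted = adain(content, t_means, t_stds)
```

In this sketch the converted features keep the temporal shape of the source channels while their per-channel mean and standard deviation match those of the target, which is the intuition behind swapping speaker statistics in a single encoder-decoder model.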

Similar Papers

A hybrid CNN-LSTM model with adaptive instance normalization for one shot singing voice conversion
Assila Yousuf ... David Solomon George
AIMS Electronics and Electrical Engineering | VOL. 8
01 Jan 2024

Comparing the performance of classic voice-driven assistive systems for dysarthric speech
Wei-Zhong Zheng ... Ying-Hui Lai
Biomedical Signal Processing and Control | VOL. 81
07 Dec 2022

Speech naturalness improvement via ε-closed extended vectors sets in voice conversion systems
Mohammad Javad Jannati ... Abolfazl Razi
Multidimensional Systems and Signal Processing | VOL. 29
12 Jan 2017

Non-Parallel Voice Conversion System With WaveNet Vocoder and Collapsed Speech Suppression
Yi-Chiao Wu ... Kazuhiro Kobayashi
IEEE Access | VOL. 8
01 Jan 2020
