Voice Cloning Using Transfer Learning with Audio Samples

Usman Nawaz, Usman Ahmed Raza, Amjad Farooq, Muhammad Junaid Iqbal, Ammara Tariq

doi:10.32350/umt-air.32.04

Abstract

Voice cloning refers to the artificial replication of a specific human voice. This study reviewed several deep learning approaches to voice cloning and, on that basis, proposes a cloning system that generates natural-sounding audio from only a few seconds of source speech from the target speaker. The system uses transfer learning, carrying knowledge from a speaker verification task over to multi-speaker text-to-speech synthesis. In zero-shot mode, it generates speech in the voices of different speakers, including speakers not seen during training. Latent embeddings encode the speaker-specific information, which allows the remaining model parameters to be shared across all speakers. Speaker modelling is decoupled from voice synthesis by training a separate speaker-discriminative encoder network. Because the two networks require different types of data, this decoupling allows each to be trained on a separate dataset. For zero-shot adaptation to unseen speakers, the embedding-based approach improves speaker similarity. It also reduces computational resource requirements, which may be advantageous for use cases that call for low-resource deployment.
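
The decoupled design described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the mean-pooling "encoder" is a toy stand-in for a trained speaker-discriminative network, and the function names (`speaker_embedding`, `condition_on_speaker`, `cosine_similarity`) as well as the choice to concatenate the embedding to every text-encoder timestep are assumptions made here for illustration only.

```python
import math
from typing import List, Sequence

Vector = List[float]


def l2_normalize(v: Sequence[float]) -> Vector:
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        return [0.0 for _ in v]
    return [x / norm for x in v]


def speaker_embedding(frames: Sequence[Sequence[float]]) -> Vector:
    """Toy stand-in for a trained speaker encoder (assumption).

    Mean-pools frame-level features of a short reference utterance and
    L2-normalizes the result into one fixed-size embedding.
    """
    if not frames:
        raise ValueError("need at least one frame of reference audio features")
    dim = len(frames[0])
    mean = [sum(frame[i] for frame in frames) / len(frames) for i in range(dim)]
    return l2_normalize(mean)


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity, a common proxy for speaker resemblance."""
    a_n, b_n = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a_n, b_n))


def condition_on_speaker(
    text_states: Sequence[Sequence[float]], embedding: Sequence[float]
) -> List[Vector]:
    """Append the same speaker embedding to every text-encoder timestep.

    The synthesizer weights are shared across speakers; only the
    embedding changes from one voice to another (hypothetical scheme).
    """
    return [list(state) + list(embedding) for state in text_states]


# A few frames of hypothetical reference features for an unseen speaker.
reference_frames = [[1.0, 0.0, 0.0], [0.8, 0.2, 0.0]]
embedding = speaker_embedding(reference_frames)

# Hypothetical text-encoder outputs for three timesteps.
text_states = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
conditioned = condition_on_speaker(text_states, embedding)
```

Because the embedding is computed independently of the synthesizer, a new voice only requires running the encoder on a short reference clip; nothing in the synthesizer is retrained, which is what makes the zero-shot setting possible in this kind of design.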

Journal: UMT Artificial Intelligence Review
Publication Date: Dec 20, 2023
License type: CC BY 4.0

Similar Papers

A Voice Cloning Method Based on the Improved HiFi-GAN Model.
Zeyu Qiu ... Yaxin Zhang
Computational Intelligence and Neuroscience | VOL. 2022
11 Oct 2022

Voice Cloning Applied to Voice Disorders: a Study of Extreme Phonetic Content in Speaker Embeddings
Lily Wadoux ... Nelly Barbot
Proceedings of the Canadian Conference on Artificial Intelligence | VOL. -
27 May 2022

MRMI-TTS: Multi-Reference Audios and Mutual Information Driven Zero-Shot Voice Cloning
Yi Ting Chen ... Wanting Li
ACM Transactions on Asian and Low-Resource Language Information Processing | VOL. 23
10 May 2024

Physics-informed neural networks for spherical indentation problems
Karuppasamy Pandian Marimuthu ... Hyungyil Lee
Materials & Design | VOL. 236
17 Nov 2023
