Speech Time-Scale Modification With GANs

Eyal Cohen, Felix Kreuk, Joseph Keshet

doi:10.1109/lsp.2022.3164361

Abstract

When listening to spoken content, it is often desirable to vary the speech rate while preserving the speaker’s timbre and pitch. To date, advanced signal processing techniques have been used to address this task, but maintaining high speech quality at all time-scales remains a challenge. Inspired by the success of speech generation using Generative Adversarial Networks (GANs), we propose a novel unsupervised learning algorithm for time-scale modification (TSM) of speech, called ScalerGAN. The model is trained on a set of speech utterances for which no time-scales are provided. ScalerGAN is composed of a generator that takes as input speech together with the desired rate and outputs time-adjusted speech; a discriminator that operates on various spectrum scales; and a decoder that converts the time-adjusted signal back to the original rate to maintain consistency. Human listeners compared ScalerGAN with other state-of-the-art TSM methods in an A/B test and a conditional A/B test. The results showed that ScalerGAN achieves higher speech quality than all of the other methods.
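
The consistency idea described in the abstract (scale the signal, then map it back to the original rate) can be illustrated with a minimal Python sketch. This is not ScalerGAN: naive linear interpolation over feature frames stands in for the learned generator and decoder, and the function names `resample_to_length`, `time_scale`, and `consistency_error` are hypothetical. The rate convention (a rate above 1 means faster speech and fewer frames) is also an assumption made for this example.

```python
def resample_to_length(frames, n_out):
    """Linearly interpolate a list of feature frames to n_out frames.

    Naive stand-in for a learned time-scaling network (assumption).
    """
    n_in = len(frames)
    if n_in == 0 or n_out < 1:
        raise ValueError("need at least one input frame and one output frame")
    out = []
    for j in range(n_out):
        # Map output index j to a fractional position on the input time axis.
        pos = j * (n_in - 1) / (n_out - 1) if n_out > 1 else 0.0
        lo = int(pos)
        hi = min(lo + 1, n_in - 1)
        w = pos - lo
        out.append([(1 - w) * a + w * b for a, b in zip(frames[lo], frames[hi])])
    return out


def time_scale(frames, rate):
    """Change the number of frames by `rate` (assumed: rate > 1 is faster)."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    return resample_to_length(frames, max(1, round(len(frames) / rate)))


def consistency_error(frames, rate):
    """Mean absolute error after scaling by `rate` and mapping back.

    Mirrors the role of the decoder in the abstract: the time-adjusted
    signal is returned to the original rate and compared with the input.
    """
    scaled = time_scale(frames, rate)
    restored = resample_to_length(scaled, len(frames))
    total = sum(abs(a - b)
                for f, g in zip(frames, restored)
                for a, b in zip(f, g))
    return total / (len(frames) * len(frames[0]))


if __name__ == "__main__":
    # Toy "spectrogram": 10 frames with 2 features each.
    toy = [[float(t), 2.0 * t] for t in range(10)]
    print(len(time_scale(toy, 0.5)), consistency_error(toy, 0.5))
```

With plain interpolation, a smooth input survives the round trip almost unchanged, while a rapidly varying one does not. In a learned system, an error of this kind would serve as a training signal rather than a diagnostic.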

Journal: IEEE Signal Processing Letters
Publication Date: Jan 1, 2022
Citations: 7

Similar Papers

Design of Objective Quality Measures for Time-Scale Modification of Audio
02 Feb 2021

Non-uniform time scale modification using instants of significant excitation and vowel onset points
K Sreenivasa Rao ... Anil Kumar Vuppala
Speech Communication | VOL. 55
25 Mar 2013

Time-scale modification of speech using an incremental time-frequency approach with waveform structure compensation
B Sylvestre ... P Kabal
01 Jan 1992

A Review of Time-Scale Modification of Music Signals
Jonathan Driedger ... Meinard Müller
Applied Sciences | VOL. 6
18 Feb 2016
