DNN-based Speaker-adaptive Postfiltering with Limited Adaptation Data for Statistical Speech Synthesis Systems

Mirac Goksu Ozturk, Cenk Demiroglu, Okan Ulusoy

doi:10.1109/icassp.2019.8683714

Abstract

Deep neural networks (DNNs) have been successfully deployed for acoustic modelling in statistical parametric speech synthesis (SPSS) systems. Moreover, DNN-based postfilters (PF) have also been shown to outperform conventional postfilters that are widely used in SPSS systems for increasing the quality of synthesized speech. However, existing DNN-based postfilters are trained with speaker-dependent databases. Given that SPSS systems can rapidly adapt to new speakers from generic models, there is a need for DNN-based postfilters that can adapt to new speakers with minimal adaptation data. Here, we compare DNN-, RNN-, and CNN-based postfilters together with adversarial (GAN) training and cluster-based initialization (CI) for rapid adaptation. Results indicate that the feedforward (FF) DNN, together with GAN and CI, significantly outperforms the other recently proposed postfilters.
