Abstract

In statistical parametric speech synthesis (SPSS), a few studies have investigated the Lombard effect, specifically by using hidden Markov model (HMM)-based systems. Recently, artificial neural networks have demonstrated promising results in SPSS, specifically with long short-term memory recurrent neural networks (LSTMs). The Lombard effect, however, has not been studied in LSTM-based speech synthesis systems. In this study, we propose three methods for Lombard speech adaptation in LSTM-based speech synthesis. In particular, we (1) augment the linguistic input features with Lombard-specific information, (2) scale the hidden activations using the learning hidden unit contributions (LHUC) method, and (3) fine-tune LSTMs trained on normal speech with a small amount of Lombard speech data. To investigate the effectiveness of the proposed methods, we carry out experiments using small (10 utterances) and large (500 utterances) Lombard speech data sets. Experimental results confirm the adaptability of the LSTMs, and similarity tests show that the LSTMs achieve significantly better adaptation performance than the HMMs in both the small and large data conditions.
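To illustrate the second adaptation method, below is a minimal PyTorch sketch of LHUC scaling applied to an LSTM layer. It is not the paper's implementation; the layer sizes, module names, and single-layer architecture are illustrative assumptions. The key idea is that each hidden unit gets one learnable amplitude parameter, and only these parameters are updated during adaptation while the pre-trained LSTM weights stay frozen.

```python
import torch
import torch.nn as nn

class LHUCLSTM(nn.Module):
    """Sketch of LHUC adaptation on top of a single LSTM layer.

    All dimensions below are hypothetical placeholders, not values
    from the paper.
    """

    def __init__(self, input_size=425, hidden_size=512, output_size=187):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, batch_first=True)
        # One LHUC parameter per hidden unit, initialised to zero so that
        # 2 * sigmoid(0) = 1, i.e. identity scaling before adaptation.
        self.lhuc = nn.Parameter(torch.zeros(hidden_size))
        self.out = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        h, _ = self.lstm(x)                       # (batch, time, hidden)
        h = h * (2.0 * torch.sigmoid(self.lhuc))  # per-unit amplitude in (0, 2)
        return self.out(h)

model = LHUCLSTM()

# Adaptation step: freeze everything except the LHUC scalers, so the
# small Lombard data only re-weights the existing hidden units.
for name, p in model.named_parameters():
    p.requires_grad = (name == "lhuc")

x = torch.randn(2, 100, 425)  # dummy linguistic feature sequences
y = model(x)
print(y.shape)                # torch.Size([2, 100, 187])
```

Because only one scalar per hidden unit is trained, this scheme has far fewer adaptation parameters than full fine-tuning, which is what makes it attractive in the small-data (10 utterance) condition.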
