Unsupervised and phonologically controlled interpolation of Austrian German language varieties for speech synthesis

Markus Toman, Michael Pucher, Sylvia Moosmüller, Dietmar Schabus

doi:10.1016/j.specom.2015.06.005

Abstract

This paper presents an unsupervised method for gradual interpolation between language varieties in statistical parametric speech synthesis using Hidden Semi-Markov Models (HSMMs). We apply dynamic time warping with Kullback–Leibler divergence as the distance measure to two sequences of HSMM states in order to find adequate interpolation partners. The method operates on state sequences with explicit durations and also on expanded state sequences in which each state corresponds to one feature frame. In a subjective evaluation of synthesized test sentences, covering intelligibility and dialect rating, we show that our method can generate intermediate varieties for three Austrian dialects (Viennese, Innervillgraten, Bad Goisern). We also provide an extensive phonetic analysis of the interpolated samples. The analysis covers input-switch rules, which reflect historically different phonological developments of the dialects versus the standard language, and phonological processes, which are phonetically motivated, gradual, and common to all varieties. We present an extended method that interpolates phonological processes linearly but uses a step function for input-switch rules. Our evaluation shows that integrating this kind of phonological knowledge improves the dialect authenticity of the synthesized speech as judged by dialect speakers. Since gradual transitions between varieties are an existing phenomenon, our methods can be used to adapt speech output systems accordingly.
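
The alignment and interpolation steps described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes each HSMM state is a single diagonal-covariance Gaussian given as a `(mean, variance)` pair, uses a symmetrised Kullback–Leibler divergence as the DTW distance, and interpolates means and variances linearly. The function names, the `switch_flags` argument, and the step threshold of 0.5 are all hypothetical choices made for this example.

```python
import math

def kl_diag_gauss(mean_p, var_p, mean_q, var_q):
    """KL(p || q) for two diagonal-covariance Gaussians."""
    return 0.5 * sum(
        math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0
        for mp, vp, mq, vq in zip(mean_p, var_p, mean_q, var_q)
    )

def symmetric_kl(state_a, state_b):
    """Symmetrised KL divergence (assumed here as the DTW distance)."""
    (ma, va), (mb, vb) = state_a, state_b
    return kl_diag_gauss(ma, va, mb, vb) + kl_diag_gauss(mb, vb, ma, va)

def dtw_align(seq_a, seq_b, dist=symmetric_kl):
    """Return the DTW path as (index_in_a, index_in_b) pairs.

    Both sequences are assumed to be non-empty.
    """
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    # Backtrack from the end to recover the interpolation partners.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return path

def interpolation_weight(alpha, input_switch, threshold=0.5):
    """Linear weight for phonological processes, step for input-switch rules.

    The threshold value is an assumption made for this sketch.
    """
    if input_switch:
        return 0.0 if alpha < threshold else 1.0
    return alpha

def interpolate_states(seq_a, seq_b, path, alpha, switch_flags=None):
    """Interpolate aligned state pairs; alpha=0 gives variety A, alpha=1 variety B."""
    result = []
    for k, (i, j) in enumerate(path):
        is_switch = bool(switch_flags and switch_flags[k])
        w = interpolation_weight(alpha, is_switch)
        (ma, va), (mb, vb) = seq_a[i], seq_b[j]
        mean = [(1 - w) * x + w * y for x, y in zip(ma, mb)]
        var = [(1 - w) * x + w * y for x, y in zip(va, vb)]
        result.append((mean, var))
    return result

# Toy one-dimensional example: two states versus three states.
standard = [([0.0], [1.0]), ([5.0], [1.0])]
dialect = [([0.1], [1.0]), ([0.2], [1.0]), ([5.1], [1.0])]
path = dtw_align(standard, dialect)
print(path)  # [(0, 0), (0, 1), (1, 2)]
halfway = interpolate_states(standard, dialect, path, alpha=0.5)
```

With `switch_flags` set for an aligned pair, that pair jumps from one variety to the other at the threshold instead of passing through intermediate values, which mirrors the distinction the abstract draws between input-switch rules and gradual phonological processes. Handling of state durations and of expanded per-frame sequences is left out of this sketch.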

Journal: Speech Communication
Publication Date: Jun 12, 2015
Citations: 4
License type: cc-by-nc-nd
