A new fuzzy unit selection cost function optimized by relaxed gradient descent algorithm

Matej Rojc, Izidor Mlakar

doi:10.1016/j.eswa.2020.113552

Abstract

In data-driven corpus-based text-to-speech synthesis systems, the main challenge is to select the most natural-sounding sequence of acoustic units, avoiding unnatural acoustic transitions and minimizing acoustic mismatches at the concatenation points. Unit selection algorithms that incorporate unit selection cost functions are known to synthesize speech close to natural quality. However, these algorithms operate over large acoustic inventories containing a huge number of acoustic units in a broad spectrum of linguistic, prosodic and acoustic contexts, with a correspondingly huge number of concatenation possibilities. Moreover, the shape of the unit selection cost function, which evaluates the cost of concatenating two subsequent acoustic units, is modelled manually in a time-consuming and laborious iterative process based on subjective evaluation. Since this process must be repeated for every new acoustic inventory, and even after changes to an existing one, we propose a new fuzzy unit selection cost function instead. We further propose to optimize the shape of the fuzzy unit selection cost function fully automatically for the context of the given acoustic inventory by using a relaxed gradient descent algorithm, in which the subjective tests are replaced by a novel objective measure of unit selection cost function performance. Furthermore, the proposed approach is fully interpretable and indicates which parts of the fuzzy unit selection cost function’s shape could be further improved. The experiments show that the optimized fuzzy unit selection cost function significantly outperforms the baseline fuzzy unit selection cost function. Moreover, the results show that the unit selection optimization algorithm is capable of finding the optimal shape of the fuzzy unit selection cost function, even when optimized over a small subset of sentences.
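To make the role of a concatenation cost in unit selection concrete, the following is a minimal illustrative sketch, not the cost function or optimizer from the paper. It assumes each candidate unit is reduced to a hypothetical pair (pitch at its left edge, pitch at its right edge), uses a simple ramp-shaped cost as a stand-in for a fuzzy-style mismatch measure, and finds the cheapest sequence with standard dynamic programming. The names `ramp_cost`, `join_cost`, and `select_units`, and the 20 Hz threshold, are assumptions made only for this example.

```python
from typing import Callable, List, Sequence, Tuple

Unit = Tuple[float, float]  # hypothetical unit: (left-edge pitch, right-edge pitch)


def ramp_cost(mismatch: float, low: float, high: float) -> float:
    """Map a mismatch to [0, 1]: 0 at or below `low`, 1 at or above `high`."""
    if mismatch <= low:
        return 0.0
    if mismatch >= high:
        return 1.0
    return (mismatch - low) / (high - low)


def join_cost(prev: Unit, cur: Unit) -> float:
    """Illustrative concatenation cost: pitch jump at the join, ramped over 0..20 Hz."""
    return ramp_cost(abs(prev[1] - cur[0]), 0.0, 20.0)


def select_units(
    candidates: Sequence[Sequence[Unit]],
    cost: Callable[[Unit, Unit], float],
) -> Tuple[List[int], float]:
    """Return the candidate index per position that minimizes the summed join cost."""
    best = [[0.0] * len(candidates[0])]  # best[i][j]: cheapest path ending at unit j
    back: List[List[int]] = []           # back-pointers for path recovery
    for i in range(1, len(candidates)):
        row, ptr = [], []
        for cur in candidates[i]:
            totals = [
                best[-1][k] + cost(prev, cur)
                for k, prev in enumerate(candidates[i - 1])
            ]
            k_min = min(range(len(totals)), key=totals.__getitem__)
            row.append(totals[k_min])
            ptr.append(k_min)
        best.append(row)
        back.append(ptr)
    j = min(range(len(best[-1])), key=best[-1].__getitem__)
    total = best[-1][j]
    path = [j]
    for ptr in reversed(back):  # walk back-pointers from the last position
        j = ptr[j]
        path.append(j)
    path.reverse()
    return path, total


# Toy inventory: two candidate units for each of three target positions.
toy = [
    [(100.0, 110.0), (100.0, 130.0)],
    [(112.0, 120.0), (150.0, 160.0)],
    [(121.0, 125.0), (165.0, 170.0)],
]
path, total = select_units(toy, join_cost)
print(path, total)
```

In this sketch the ramp thresholds are fixed by hand; the abstract's point is that such shape parameters would instead be tuned automatically against an objective measure.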

Journal: Expert Systems with Applications	Publication Date: May 15, 2020
Citations: 7

Similar Papers

An LSTM-based model for the compression of acoustic inventories for corpus-based text-to-speech synthesis systems
Matej Rojc ... Izidor Mlakar
Computers and Electrical Engineering | VOL. 100
06 Apr 2022

A New Unit Selection Optimisation Algorithm for Corpus-Based TTS Systems Using the RBF-Based Data Compression Technique
Matej Rojc ... Izidor Mlakar
IEEE Access | VOL. 7
01 Jan 2019

Perceptual evaluation of dynamic cost weighting for unit selection TTS
Jerome R Bellegarda
01 Jan 2009

GRADIENT-DESCENT BASED UNIT-SELECTION OPTIMIZATION ALGORITHM USED FOR CORPUS-BASED TEXT-TO-SPEECH SYNTHESIS
Matej Rojc ... Zdravko Kačič
Applied Artificial Intelligence | VOL. 25
01 Aug 2011
