A continuous vocoder for statistical parametric speech synthesis and its evaluation using an audio-visual phonetically annotated Arabic corpus

Mohammed Salah Al-Radhi, Omnia Abdo, Tamás Gábor Csapó, Sherif Abdou, Géza Németh, Mervat Fashal

doi:10.1016/j.csl.2019.101025

Abstract

In this paper, we present an extension of a novel continuous residual-based vocoder for statistical parametric speech synthesis that addresses two objectives. First, because the noise component is often not accurately modelled in modern vocoders (e.g. STRAIGHT), we propose a new technique for modelling unvoiced sounds: a time-domain envelope is added to the unvoiced segments to avoid any residual buzziness. Four time-domain envelopes (Amplitude, Hilbert, Triangular and True) are investigated, enhanced, and then applied to the noise component of the excitation in our continuous vocoder, i.e. a vocoder in which all parameters are continuous. Second, with the future aim of producing high-quality Arabic speech synthesis, we apply this vocoder to a Modern Standard Arabic audio-visual corpus that is annotated both phonetically and visually and is dedicated to emotional speech processing studies. In an objective experiment, we investigated the Phase Distortion Deviation, and in a MUSHRA-type subjective listening test we compared natural and vocoded speech samples. Both experiments show that the proposed noise modelling gives satisfactory results in terms of naturalness and intelligibility, while outperforming STRAIGHT and other earlier residual-based approaches.
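The idea of shaping the noise excitation with a time-domain envelope can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not define the four envelopes or how they are enhanced, so the code below assumes the textbook Hilbert envelope (magnitude of the analytic signal) and a simple energy-matching step. The function names `hilbert_envelope` and `shape_noise` are hypothetical.

```python
import numpy as np


def hilbert_envelope(x):
    """Magnitude of the analytic signal of a real 1-D frame (textbook definition)."""
    n = len(x)
    spectrum = np.fft.fft(x)
    # Keep DC (and Nyquist for even n), double positive frequencies, zero the rest.
    h = np.zeros(n)
    h[0] = 1.0
    if n % 2 == 0:
        h[n // 2] = 1.0
        h[1:n // 2] = 2.0
    else:
        h[1:(n + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(spectrum * h))


def shape_noise(residual_frame, rng):
    """Modulate white noise by the frame's envelope (assumed shaping scheme)."""
    envelope = hilbert_envelope(residual_frame)
    noise = rng.standard_normal(len(residual_frame))
    shaped = noise * envelope
    # Assumption: rescale so the shaped noise has the same RMS as the input frame.
    target_rms = np.sqrt(np.mean(residual_frame ** 2))
    current_rms = np.sqrt(np.mean(shaped ** 2))
    return shaped * (target_rms / current_rms) if current_rms > 0 else shaped


# Usage: a decaying burst stands in for an unvoiced residual frame.
rng = np.random.default_rng(0)
t = np.arange(400)
frame = np.exp(-t / 100.0) * rng.standard_normal(400)
shaped_frame = shape_noise(frame, rng)
```

The shaped noise follows the temporal energy contour of the frame instead of being stationary, which is the general motivation the abstract gives for adding an envelope to unvoiced segments.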

Journal: Computer Speech & Language
Publication Date: Sep 29, 2019
Citations: 7

Similar Papers

Voice and Speech Synthesis—Highlighting the Control of Prosody
Keikichi Hirose
06 Dec 2018

Multi-speaker modeling with shared prior distributions and model structures for Bayesian speech synthesis
Kei Hashimoto ... Yoshihiko Nankaku
27 Aug 2011

A Review of Deep Learning Based Speech Synthesis
Yishuang Ning ... Sheng He
Applied Sciences | VOL. 9
27 Sep 2019

A deterministic plus noise model of excitation signal using principal component analysis for parametric speech synthesis
N P Narendra ... K Sreenivasa Rao
01 Mar 2016
