OPENGLOT – An open environment for the evaluation of glottal inverse filtering

Paavo Alku, Tiina Murtola, Jarmo Malinen, Juha Kuortti, Brad Story, Manu Airaksinen, Mika Salmi, Erkki Vilkman, Ahmed Geneid

doi:10.1016/j.specom.2019.01.005


Open Access

https://doi.org/10.1016/j.specom.2019.01.005


Abstract

Glottal inverse filtering (GIF) refers to technology to estimate the source of voiced speech, the glottal flow, from speech signals. When a new GIF algorithm is proposed, its accuracy needs to be evaluated. However, the evaluation of GIF is problematic because the ground truth, the real glottal volume velocity signal generated by the vocal folds, cannot be recorded non-invasively from natural speech. This absence of the ground truth has been circumvented in most previous GIF studies by using simple linear source-filter synthesis techniques with known artificial glottal flow models and all-pole vocal tract filters. Moreover, in a few previous studies, physical modeling of speech production has been utilized in synthesis of the test data for GIF evaluation. The evaluation strategy in previous GIF studies is, however, scattered between individual investigations and there is currently a lack of a coherent, common platform to be used in GIF evaluation. In order to address this shortcoming, the current study introduces a new environment, called OPENGLOT, for GIF evaluation. The key ideas of OPENGLOT are twofold: the environment is versatile (i.e., it provides different types of test signals for GIF evaluation) and open (i.e., the system can be used by anyone who wants to evaluate her or his new GIF method and compare it objectively to previously developed benchmark techniques). OPENGLOT consists of four main parts, Repositories I–IV, that contain data and sound synthesis software. Repository I contains a large set of synthetic glottal flow waveforms, and speech signals generated by using the Liljencrants–Fant (LF) waveform as an artificial excitation, and a digital all-pole filter to model the vocal tract. Repository II contains glottal flow and speech pressure signals generated using physical modeling of human speech production. 
Repository III contains pairs of glottal excitation and speech pressure signals generated by exciting a 3D-printed plastic vocal tract replica with LF excitations via a loudspeaker. Finally, Repository IV contains multichannel recordings (speech pressure signal, electroglottogram, high-speed video of the vocal folds) from natural speech production. After presenting these four core parts of OPENGLOT, the article demonstrates the platform with a typical use case.
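
The evaluation idea behind synthetic test data (as in Repository I) can be illustrated with a minimal sketch. It is not the OPENGLOT software: it substitutes a simple Rosenberg-type pulse for the LF waveform, omits lip radiation, and uses hypothetical values for the sampling rate, fundamental frequency, and formants. Because the source is known, the output of an inverse filter can be compared directly with the ground truth.

```python
import math

def rosenberg_pulse(n_period, open_quotient=0.6, speed_quotient=2.0):
    """One period of a Rosenberg-type glottal flow pulse (stand-in for LF)."""
    n_open = int(round(open_quotient * n_period))
    n_rise = int(round(n_open * speed_quotient / (1.0 + speed_quotient)))
    n_fall = n_open - n_rise
    pulse = []
    for n in range(n_period):
        if n < n_rise:      # opening phase
            pulse.append(0.5 * (1.0 - math.cos(math.pi * n / n_rise)))
        elif n < n_open:    # closing phase
            pulse.append(math.cos(0.5 * math.pi * (n - n_rise) / n_fall))
        else:               # closed phase
            pulse.append(0.0)
    return pulse

def convolve(p, q):
    """Multiply two polynomials given as coefficient lists."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, p_i in enumerate(p):
        for j, q_j in enumerate(q):
            out[i + j] += p_i * q_j
    return out

def all_pole_coeffs(formants, fs):
    """Denominator A(z) built from (frequency, bandwidth) pairs in Hz."""
    a = [1.0]
    for freq, bw in formants:
        r = math.exp(-math.pi * bw / fs)
        theta = 2.0 * math.pi * freq / fs
        a = convolve(a, [1.0, -2.0 * r * math.cos(theta), r * r])
    return a

def synthesize(source, a):
    """All-pole filtering: y[n] = x[n] - sum_{k>=1} a[k] * y[n-k]."""
    out = []
    for n, x in enumerate(source):
        acc = x
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * out[n - k]
        out.append(acc)
    return out

def inverse_filter(speech, a):
    """FIR inverse of the all-pole filter: x[n] = sum_k a[k] * y[n-k]."""
    est = []
    for n in range(len(speech)):
        acc = 0.0
        for k in range(len(a)):
            if n - k >= 0:
                acc += a[k] * speech[n - k]
        est.append(acc)
    return est

def rms_error(x, y):
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(x, y)) / len(x))

# Hypothetical settings, chosen only for illustration.
FS = 8000                                   # sampling rate in Hz
F0 = 100                                    # fundamental frequency in Hz
TRUE_FORMANTS = [(700, 80), (1200, 90), (2600, 120)]

flow = rosenberg_pulse(FS // F0) * 10       # ten periods of known glottal flow
a_true = all_pole_coeffs(TRUE_FORMANTS, FS)
speech = synthesize(flow, a_true)

# Inverse filtering with the true vocal tract recovers the source.
err_exact = rms_error(flow, inverse_filter(speech, a_true))

# A vocal tract estimate with a shifted first formant leaves a residual error.
a_wrong = all_pole_coeffs([(750, 80), (1200, 90), (2600, 120)], FS)
err_mismatch = rms_error(flow, inverse_filter(speech, a_wrong))

print("RMS error, exact filter:     ", err_exact)
print("RMS error, mismatched filter:", err_mismatch)
```

In a real evaluation, the vocal tract coefficients would come from the GIF method under test rather than from the synthesis settings, and the error between the estimated and reference glottal flow would be summarized over many test signals.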


Journal: Speech Communication
Publication Date: Jan 31, 2019
Citations: 27
License type: cc-by-nc-nd


Similar Papers

Glottal inverse filtering by combining a constrained LP and an HMM-based generative model of glottal flow derivative
Akira Sasou
Speech Communication | VOL. 104
20 Jul 2018

Parameterization of a computational physical model for glottal flow using inverse filtering and high-speed videoendoscopy
Tiina Murtola ... Ahmed Geneid
Speech Communication | VOL. 96
11 Nov 2017

Estimation of the glottal source from coded telephone speech using deep neural networks
N.P Narendra ... Paavo Alku
Speech Communication | VOL. 106
08 Dec 2018

Analysis of glottal inverse filtering in the presence of source-filter interaction
Anil Palaparthi ... Ingo R Titze
Speech Communication | VOL. 123
24 Jul 2020
