Evaluation of Glottal Inverse Filtering Algorithms Using a Physiologically Based Articulatory Speech Synthesizer.

Jón Guðnason, Matías Zañartu, Yu-Ren Chien, Daryush D Mehta, Thomas F Quatieri

doi:10.1109/taslp.2017.2714839

Abstract

Glottal inverse filtering aims to estimate the glottal airflow signal from a speech signal for applications such as speaker recognition and clinical voice assessment. Nonetheless, evaluation of inverse filtering algorithms has been challenging due to the practical difficulties of directly measuring glottal airflow. In addition, the performance of many methods is known to degrade in voice conditions that are of great interest, such as breathiness, high pitch, soft voice, and running speech. This paper presents a comprehensive, objective, and comparative evaluation of state-of-the-art inverse filtering algorithms that takes advantage of speech and glottal airflow signals generated by a physiological speech synthesizer. The synthesizer provides a physics-based simulation of the voice production process and thus an adequate test bed for revealing the temporal and spectral performance characteristics of each algorithm. Included in the synthetic data are continuous speech utterances and sustained vowels, which are produced with multiple voice qualities (pressed, slightly pressed, modal, slightly breathy, and breathy), fundamental frequencies, and subglottal pressures to simulate the natural variations in real speech. In evaluating the accuracy of a glottal flow estimate, multiple error measures are used, including an error in the estimated signal that measures overall waveform deviation, as well as an error in each of several clinically relevant features extracted from the glottal flow estimate. Waveform errors calculated from glottal flow estimation experiments exhibited mean values of around 30% for sustained vowels and around 40% for continuous speech, relative to the amplitude of the true glottal flow derivative. Closed-phase approaches showed remarkable stability across different voice qualities and subglottal pressures. The algorithms of choice, as suggested by significance tests, are closed-phase covariance analysis for the analysis of sustained vowels, and sparse linear prediction for the analysis of continuous speech. Results of data subset analysis suggest that analysis of close rounded vowels is an additional challenge in glottal flow estimation.
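To make the waveform error measure concrete, the following is a minimal sketch of how a deviation between a true and an estimated glottal flow derivative could be expressed as a percentage of the true signal's amplitude. The abstract does not give the exact definition used in the paper, so the choices here are assumptions: the deviation is taken as a root-mean-square error, the amplitude as the peak-to-peak range of the true signal, and the function name `waveform_error_percent` is hypothetical.

```python
import math


def waveform_error_percent(true_deriv, est_deriv):
    """RMS deviation between two equal-length signals, expressed as a
    percentage of the peak-to-peak amplitude of the true signal.

    Assumption: this normalization is illustrative only; the paper's
    own error definition may differ.
    """
    if len(true_deriv) != len(est_deriv):
        raise ValueError("signals must have the same length")
    if not true_deriv:
        raise ValueError("signals must not be empty")

    # Peak-to-peak amplitude of the reference (true) signal.
    amplitude = max(true_deriv) - min(true_deriv)
    if amplitude == 0:
        raise ValueError("true signal has zero amplitude")

    # Mean squared sample-wise deviation, then its square root.
    mse = sum((t - e) ** 2 for t, e in zip(true_deriv, est_deriv)) / len(true_deriv)
    return 100.0 * math.sqrt(mse) / amplitude


# Toy signals, not data from the paper.
true_d = [0.0, 1.0, 0.0, -1.0]
est_d = [0.0, 0.8, 0.1, -0.9]
print(f"{waveform_error_percent(true_d, est_d):.1f}%")
```

Because the error is normalized by the amplitude of the true signal, scaling both signals by the same factor leaves the percentage unchanged, which is what allows values to be compared across conditions such as different subglottal pressures.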

Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Publication Date: Aug 1, 2017
Citations: 11