Abstract

A speech synthesizer for use in analysis-by-synthesis speech recognition is described. Coarticulation is modeled by applying linear FIR filters to target gestures in a pseudo-articulatory domain. This domain is treated as a hidden, unobservable layer between the phonetic input to the synthesizer and the acoustic spectrum output. That output is obtained from the hidden-layer signal by means of a memoryless nonlinear transformation that is implemented by a neural net with elliptic basis functions. The entire model, including phonetic targets, FIR filter shapes, and neural-net parameters, is trained by a preconditioned conjugate gradient method, using the mean-squared error between synthetic and actual spectra as the objective function. The gradient is calculated by a back-propagation algorithm. It is found that, after training, the FIR filter shapes typically resemble noncausal two-pole lowpass characteristics, with one pole in the right and the other in the left half-plane, representing right-to-left (anticipatory) and left-to-right (carryover) coarticulation, respectively. Performance of the synthesizer on isolated-word speech, continuous read speech, and spontaneous speech will be described, and results from recognition experiments on speaker-dependent and speaker-independent tasks will be reported.
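The abstract outlines a three-stage pipeline: stepwise phonetic targets are smoothed by noncausal FIR filters in a hidden pseudo-articulatory layer, a memoryless elliptic-basis-function net maps that layer to spectra, and the whole chain is scored by a mean-squared spectral error. The following is a minimal NumPy sketch of that forward pass and objective. It is illustrative only, not the authors' implementation: the two-sided exponential kernel, the diagonal-covariance Gaussian ("elliptic") basis functions, and all names and dimensions (`two_sided_fir`, `fir_smooth`, `ebf_net`, `T`, `D`, `J`, `F`) are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def two_sided_fir(alpha, beta, half_len):
    """Noncausal two-sided exponential kernel, normalized to unit gain.
    Taps at negative lags (decay rate beta) read future targets;
    taps at positive lags (decay rate alpha) read past targets."""
    n = np.arange(-half_len, half_len + 1)
    h = np.where(n >= 0, alpha ** n, beta ** (-n))
    return h / h.sum()

def fir_smooth(targets, h):
    """Filter each pseudo-articulatory channel with the centered FIR
    kernel h, turning piecewise-constant target gestures into smooth
    coarticulated trajectories."""
    half = len(h) // 2
    padded = np.pad(targets, ((half, half), (0, 0)), mode="edge")
    return np.stack(
        [np.convolve(padded[:, k], h, mode="valid")
         for k in range(targets.shape[1])],
        axis=1,
    )

def ebf_net(x, centers, inv_var, w, b):
    """Memoryless nonlinear map from the hidden layer to spectra using
    elliptic basis functions (Gaussians with per-dimension widths,
    i.e. a diagonal Mahalanobis distance), combined linearly."""
    d2 = ((x[:, None, :] - centers[None, :, :]) ** 2
          * inv_var[None, :, :]).sum(-1)        # (T, J) squared distances
    return np.exp(-0.5 * d2) @ w + b            # (T, F) synthetic spectra

# Toy dimensions: T frames, D hidden channels, J basis centers, F spectral bins.
T, D, J, F = 200, 3, 8, 16
targets = np.repeat(rng.normal(size=(10, D)), 20, axis=0)  # target gestures
hidden = fir_smooth(targets, two_sided_fir(alpha=0.7, beta=0.5, half_len=12))

centers = rng.normal(size=(J, D))
inv_var = np.ones((J, D))
w, b = 0.1 * rng.normal(size=(J, F)), np.zeros(F)
synthetic = ebf_net(hidden, centers, inv_var, w, b)

actual = rng.normal(size=(T, F))                # stand-in for measured spectra
mse = np.mean((synthetic - actual) ** 2)        # the training objective
print(f"MSE objective: {mse:.4f}")
```

The two-sided exponential kernel is one simple realization of the reported two-pole noncausal lowpass shape: its negative-lag taps implement anticipatory (right-to-left) coarticulation and its positive-lag taps carryover (left-to-right) coarticulation. In the trained system the kernel shapes, targets, and net parameters would all be fitted jointly by the preconditioned conjugate gradient method; here they are fixed for illustration.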
