Abstract

A major difficulty in articulatory analysis/synthesis is the estimation of vocal-tract parameters from input speech. The use of neural networks to extract these parameters is more attractive than codebook look-up due to the lower computational complexity. For example, a multilayer perceptron (MLP) with two hidden layers, trained and evaluated on a small data set, was shown to perform a reasonable mapping of acoustic-to-geometric parameters. Increasing the training data, however, revealed ambiguity in the mapping that could not be resolved by a single network. This paper addresses the problem using an assembly of MLPs, each dedicated to a specific region of the articulatory space. Training data were generated by randomly sampling the parameters of an articulatory model of the vocal system. The resultant vocal-tract shapes were clustered into 128 regions, and an MLP with one hidden layer was assigned to each region for mapping 18 cepstral coefficients to ten tract areas and a nasalization parameter. Networks were selected by dynamic programming and were used to control a time-domain articulatory synthesizer. After training, significant perceptual and objective improvements were achieved relative to using a single MLP, and performance comparable to codebook look-up with dynamic programming was obtained. This model, however, requires only 4% of the storage needed for the codebook and performs the mapping faster by a factor of 20.
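The following is a minimal sketch, not the authors' implementation, of the assembly-of-MLPs mapping described above. The dimensions come from the abstract (18 cepstral coefficients in, 10 tract areas plus 1 nasalization parameter out, 128 region networks); the hidden-layer size, the centroid-distance local cost, and the transition penalty used by the dynamic-programming selection are illustrative assumptions, and training is omitted.

```python
# Sketch: per-region one-hidden-layer MLPs plus DP selection of a
# region sequence over frames. Hidden size and costs are assumptions.
import numpy as np

N_REGIONS = 128   # articulatory-space clusters, one MLP each
N_CEPSTRA = 18    # input: cepstral coefficients per frame
N_OUTPUTS = 11    # output: 10 tract areas + 1 nasalization parameter
N_HIDDEN = 32     # assumed hidden-layer size (not given in the abstract)

rng = np.random.default_rng(0)

class RegionMLP:
    """One-hidden-layer perceptron for a single articulatory region."""
    def __init__(self):
        self.w1 = rng.normal(0, 0.1, (N_CEPSTRA, N_HIDDEN))
        self.b1 = np.zeros(N_HIDDEN)
        self.w2 = rng.normal(0, 0.1, (N_HIDDEN, N_OUTPUTS))
        self.b2 = np.zeros(N_OUTPUTS)

    def __call__(self, cepstra):
        h = np.tanh(cepstra @ self.w1 + self.b1)
        return h @ self.w2 + self.b2  # tract areas + nasalization

# One (here untrained) network per region.
networks = [RegionMLP() for _ in range(N_REGIONS)]
# Cluster centroids in cepstral space, used as the local cost (assumption).
centroids = rng.normal(0, 1, (N_REGIONS, N_CEPSTRA))

def select_networks(frames, transition_cost=1.0):
    """Pick one region network per frame by dynamic programming,
    trading acoustic match against smooth region trajectories."""
    T = len(frames)
    local = np.array([[np.sum((f - c) ** 2) for c in centroids]
                      for f in frames])           # (T, N_REGIONS)
    cost = local[0].copy()
    back = np.zeros((T, N_REGIONS), dtype=int)
    for t in range(1, T):
        # Switching regions between frames incurs a fixed penalty.
        switch = cost[:, None] + transition_cost * (
            np.arange(N_REGIONS)[:, None] != np.arange(N_REGIONS)[None, :])
        back[t] = switch.argmin(axis=0)
        cost = switch.min(axis=0) + local[t]
    path = [int(cost.argmin())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t][path[-1]]))
    path.reverse()
    return [networks[r](f) for r, f in zip(path, frames)]

frames = rng.normal(0, 1, (5, N_CEPSTRA))  # five frames of toy cepstra
areas = select_networks(frames)
print(np.array(areas).shape)               # (5, 11)
```

The DP step is what distinguishes this scheme from picking the nearest region independently per frame: the transition penalty discourages implausible jumps in articulatory space, which is the same role dynamic programming plays in the codebook look-up baseline.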
