Voice mimic system using an articulatory codebook for estimation of vocal tract shape

Samir Chennoukh, James L. Flanagan, Daniel Sinder, Gael Richard

doi:10.21437/eurospeech.1997-179

Abstract

S. Chennoukh, D. Sinder, G. Richard* and J.L. Flanagan
Center for Computer Aids for Industrial Productivity (CAIP), Rutgers University, Piscataway, NJ 08855-1390, USA
*Matra-Communication, rue J.P. Timbaud, 78392 Bois d'Arcy, France
Tel. +1 908-445-0080, FAX: +1 908 445-4775, E-mail: chenoukh@caip.rutgers.edu

Voice mimic systems using articulatory codebooks require an initial estimate of the vocal tract shape in the vicinity of the global optimum. For this purpose, we need to gather a large set of corresponding articulatory and acoustic data in the articulatory codebook. Thus, searching and accessing the codebook becomes a difficult task. In this paper, the design of an articulatory codebook is presented where an acoustic network sub-samples the acoustic space such that vocal tract model shapes are ordered and clustered in the network according to acoustic parameters. Another issue addressed in this paper concerns estimating the trajectory of vocal tract shapes as they change with time. Since the inverse mapping from acoustic parameters to model shape does not have a unique solution, several vocal tract shape variations are possible. Therefore, a dynamic optimization of trajectories has been developed. This optimization uses dynamic properties of each articulatory parameter to estimate the next position.

1. INTRODUCTION

The study of speech perception and production has been enhanced in the last two decades by the development of computers capable of large amounts of computation. As a result, Stevens' study towards an articulatory model for speech recognition-synthesis becomes more feasible than it was in the early sixties ([9]). However, an incomplete understanding of speech production and the acoustics of [...] prevented us from achieving Stevens' goal. The goal was to mimic input speech signals by recognition-synthesis using a model of the vocal tract area function that can mimic the speech signals without understanding their structure or meaning.

An early attempt at creating a complete computer simulation of articulatory model speech coding using an optimization technique was reported by Flanagan et al. ([4]). The simulation is called "voice mimic". The voice mimic attempts to provide an articulatory description of the vocal tract that corresponds to an arbitrary natural speech input and to generate a synthetic signal that, within perceptual accuracy, duplicates the natural one. Central to this effort is inverse mapping from an acoustic signal to an articulatory description. However, acoustic-to-articulatory mappings are non-unique and, given a cost function, the optimization techniques converge only to a local extremum that may lie in the vicinity of the initial parameters. Therefore, one needs to choose accurate startup parameters to initialize the optimization procedure. Schroeter and Sondhi ([8]), who continued along the same lines of Flanagan et al.'s study, used an articulatory codebook proposed earlier by Atal et al. ([1]). Since a codebook is used to obtain the first estimates of the vocal tract shape that may produce a given combination of acoustic parameters, it must be designed such that it spans the natural articulatory space of a speaker. Furthermore, sampling of the space must be fine enough so that an acoustic entry always exists very close to the global optimum. Such codebooks require a large set of matching pairs of vocal tract and acoustic parameters. The complexity of searching a large codebook for all possible vocal tract model shapes becomes an issue. For this reason, the voice mimic system needs, in addition to a good articulatory codebook, an efficient procedure for accessing the codebook ([6],[7]).

The number and position of the codebook vectors affect the performance of the voice mimic system according to two competing problems. On one hand, increasing the size of the codebook increases the difficulty of the access task and, on the other hand, reduction of this size complicates the inverse problem solution. In the second section of this paper, a new design of the articulatory codebook is presented for which the inversion of the articulatory-to-acoustic mapping is processed during the building of the codebook. This codebook design allows real-time access to the set of acoustically equivalent shapes, regardless of the size of the codebook.

Since the inverse mapping from acoustic parameters to model shape does not have a unique solution, several vocal tract shape variations are possible. Schroeter and Sondhi ([7]) proposed the use of dynamic programming to estimate the optimal trajectory of the vocal tract model shape variation path. The dynamic programming requires a delay of several data frames for the speech output ([8]). In the third section, a method is proposed where the articulatory parameters are estimated within one frame. Section
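
The two ideas described above, inverting the mapping while the codebook is built and choosing among equivalent shapes without look-ahead, can be illustrated with a small sketch. It is not the authors' design: the uniform grid over formant values merely stands in for the acoustic network, the constant-velocity predictor stands in for the "dynamic properties of each articulatory parameter", and every name and number (CELL_HZ, acoustic_cell, build_codebook, lookup, pick_shape, the toy pairs) is hypothetical.

```python
from collections import defaultdict
from math import floor

CELL_HZ = 50.0  # assumed width of one acoustic cell, in Hz (illustrative only)


def acoustic_cell(formants, cell_hz=CELL_HZ):
    """Quantize a formant vector to the integer index of its grid cell."""
    return tuple(floor(f / cell_hz) for f in formants)


def build_codebook(pairs, cell_hz=CELL_HZ):
    """Group articulatory shapes by the acoustic cell their formants fall in.

    The inversion happens once, here: each cell lists every stored shape
    that is acoustically equivalent at the grid resolution.
    """
    codebook = defaultdict(list)
    for shape, formants in pairs:
        codebook[acoustic_cell(formants, cell_hz)].append(tuple(shape))
    return dict(codebook)


def lookup(codebook, formants, cell_hz=CELL_HZ):
    """Return candidate shapes for one frame with a single hash lookup,
    so the access cost does not grow with the number of stored shapes."""
    return codebook.get(acoustic_cell(formants, cell_hz), [])


def predict_next(prev2, prev1):
    """Constant-velocity guess per articulatory parameter (an assumption)."""
    return tuple(2 * b - a for a, b in zip(prev2, prev1))


def pick_shape(candidates, prev2, prev1):
    """Pick the candidate nearest the predicted position, using past frames only.

    Returns None when the cell holds no candidates.
    """
    if not candidates:
        return None
    target = predict_next(prev2, prev1)

    def sq_dist(shape):
        return sum((s - t) ** 2 for s, t in zip(shape, target))

    return min(candidates, key=sq_dist)


# Toy (shape, formants) pairs; the values are made up for illustration.
pairs = [
    ((0.2, 0.5), (510.0, 1490.0)),
    ((0.8, 0.1), (520.0, 1480.0)),  # different shape, same acoustic cell
    ((0.4, 0.9), (710.0, 1210.0)),
]
codebook = build_codebook(pairs)

# One incoming frame: two stored shapes are acoustically equivalent.
candidates = lookup(codebook, (505.0, 1460.0))

# The two previous frames disambiguate without waiting for future frames.
chosen = pick_shape(candidates, prev2=(0.6, 0.3), prev1=(0.7, 0.2))
print(candidates, chosen)
```

A hard grid ignores neighbours that fall just across a cell boundary, so a practical version would probably also inspect adjacent cells; the sketch only shows why lookup time can stay independent of codebook size and how a one-frame choice avoids the delay of dynamic programming.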

Similar Papers

Acoustic vocal tract model of one-year-old children
Milan Vojnovic ... Ljiljana Dobrijevic
Telfor Journal | VOL. 6
01 Jan 2014

Speaker based vocal tract shape estimation for Kannada vowels
Shiva Prasad K M ... M.B Manjunatha
01 Jan 2015

Analysis of Vocal Tract Shape Variability based on Formant Frequency Ratio at Various Conditions of Vowels for Indian English Speakers
Anil Kumar Chandrashekar ... M B Manjunatha
Indian Journal of Science and Technology | VOL. 10
01 Apr 2017

Comparison of vocal tract shape estimation techniques based on formant frequencies, autocorrelation, covariance and lattice
Ashwini S Patil ... Milind S Shah
01 Jan 2015
