Dynamic Speech Spectrum Representation and Tracking Variable Number of Vocal Tract Resonance Frequencies With Time-Varying Dirichlet Process Mixture Models

Emre Ozkan, I. Ozbek, Mubeccel Demirekler

doi:10.1109/tasl.2009.2022198

Abstract

In this paper, we propose a new approach to dynamic speech spectrum representation and to tracking vocal tract resonance (VTR) frequencies. The method represents the spectral density of the speech signal as a mixture of Gaussians with an unknown number of components, modeled with a time-varying Dirichlet process mixture (DPM) model. In the resulting representation, the number of formants is allowed to vary in time. The paper first presents an analysis of the continuity of the formants in the spectrum during a speech utterance. The analysis is based on a new state space representation of the concatenated tube model. We show that the number of formants that appear in the spectrum is directly related to the location of the constriction of the vocal tract (i.e., the location of the excitation). Moreover, the disappearance of formants from the spectrum is explained by “uncontrollable modes” of the state space model. Under the assumption that a varying number of formants exists in the spectrum, we propose a DPM-based multi-target tracking algorithm for tracking an unknown number of formants. The tracking algorithm defines a hierarchical Bayesian model for the unknown formant states, and inference is performed with a Rao–Blackwellized particle filter.
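The two ingredients named in the abstract, a Gaussian-mixture spectral density and a Dirichlet process prior that leaves the number of components open, can be illustrated with a minimal sketch. This is not the authors' algorithm: it uses a static Chinese restaurant process rather than the time-varying DPM, and it omits the state space model and the Rao–Blackwellized particle filter. The function names `mixture_spectrum` and `crp_assignments`, the concentration parameter `alpha`, and all numeric values are assumptions made only for illustration.

```python
import math
import random


def mixture_spectrum(freq_hz, weights, centers_hz, widths_hz):
    """Evaluate a Gaussian-mixture spectral density at one frequency.

    Each component stands in for one resonance: centre, width, and weight.
    """
    total = 0.0
    for w, mu, sigma in zip(weights, centers_hz, widths_hz):
        z = (freq_hz - mu) / sigma
        total += w * math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))
    return total


def crp_assignments(n_points, alpha, rng):
    """Draw component labels from a Chinese restaurant process prior.

    The number of components is not fixed in advance; it grows with the data.
    """
    counts = []  # counts[k] = number of points already assigned to component k
    labels = []
    for i in range(n_points):
        # Existing component k has probability counts[k] / (i + alpha);
        # a new component is opened with probability alpha / (i + alpha).
        u = rng.random() * (i + alpha)
        cumulative = 0.0
        chosen = len(counts)  # default: open a new component
        for k, c in enumerate(counts):
            cumulative += c
            if u < cumulative:
                chosen = k
                break
        if chosen == len(counts):
            counts.append(0)
        counts[chosen] += 1
        labels.append(chosen)
    return labels, counts


# Illustrative use: the prior decides how many components appear.
rng = random.Random(0)
labels, counts = crp_assignments(200, alpha=1.0, rng=rng)
weights = [c / sum(counts) for c in counts]
print("components drawn:", len(counts))
```

In a tracker of the kind the abstract describes, the component count would change from frame to frame as formants appear and disappear; here it is drawn once, which is enough to show why no fixed number of Gaussians has to be chosen beforehand.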

Journal: IEEE Transactions on Audio, Speech, and Language Processing
Publication Date: Nov 1, 2009
Citations: 54

Similar Papers

Human vocal tract resonances and the corresponding mode shapes investigated by three-dimensional finite-element modelling based on CT measurement
Tomáš Vampola ... Jan G Švec
Logopedics Phoniatrics Vocology | VOL. 40
21 Mar 2013

Tracking of vocal tract resonances based on dynamic programming and Kalman filtering
I. Yucel Ozbek ... Mubeccel Demirekler
01 Apr 2008

RB Particle Filter Time Synchronization Algorithm Based on the DPM Model.
Chunsheng Guo ... Na Ying
Sensors | VOL. 15
03 Sep 2015

Vocal Tract and Subglottal Impedance in High Performance Singing: A Case Study
Patrick Hoyer ... Simone Graf
Journal of Voice | VOL. 38
01 Feb 2022
