Abstract

This paper presents a voice source waveform modeling technique based on principal component analysis (PCA) and Gaussian mixture modeling (GMM). The voice source is obtained by inverse-filtering speech with the estimated vocal tract filter. This decomposition is useful in speech analysis, synthesis, recognition and coding. Existing models of the voice source signal are based on function fitting or physically motivated assumptions. Although they are well defined, estimation of their parameters is not well understood, and few are capable of reproducing the large variety of voice source waveforms. Here, a data-driven approach is presented for signal decomposition and classification based on the principal components of the voice source. The principal components are analyzed and the ‘prototype’ voice source signals corresponding to the Gaussian mixture means are examined. We show how an unknown signal can be decomposed into its components and/or prototypes and resynthesized. We show how the techniques are suited to both low-bitrate and high-quality analysis/synthesis schemes.
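
The decomposition and resynthesis step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the voice source has already been cut into frames of equal length (for example, one length-normalized glottal cycle per row), uses synthetic stand-in data instead of inverse-filtered speech, and omits the GMM stage. The function names `fit_pca`, `decompose` and `resynthesize` are hypothetical.

```python
import numpy as np

def fit_pca(frames, n_components):
    """Estimate the mean frame and the leading principal components.

    frames: array of shape (n_frames, frame_length), one frame per row.
    """
    mean = frames.mean(axis=0)
    # Rows of vt are orthonormal directions ordered by decreasing variance.
    _, _, vt = np.linalg.svd(frames - mean, full_matrices=False)
    return mean, vt[:n_components]

def decompose(frame, mean, components):
    """Project one frame onto the principal components (analysis)."""
    return components @ (frame - mean)

def resynthesize(weights, mean, components):
    """Rebuild a frame from its component weights (synthesis)."""
    return mean + weights @ components

# Synthetic stand-in data (an assumption, not real voice source frames):
# random mixtures of three smooth shapes plus a little noise.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 64)
basis = np.stack([np.sin(np.pi * t), np.sin(2 * np.pi * t), np.sin(3 * np.pi * t)])
frames = rng.normal(size=(200, 3)) @ basis + 0.01 * rng.normal(size=(200, 64))

mean, components = fit_pca(frames, n_components=3)
weights = decompose(frames[0], mean, components)
approx = resynthesize(weights, mean, components)
rel_error = np.linalg.norm(frames[0] - approx) / np.linalg.norm(frames[0])
```

In this sketch each frame is represented by `n_components` weights, so keeping fewer components gives a more compact description and keeping more gives a closer reconstruction, which is one way to read the low-bitrate versus high-quality trade-off mentioned in the abstract.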
