Abstract
The standard hidden Markov model (HMM) assumes local, or state-conditioned, stationarity of the signals being modeled. In this article, we present recent developments in generalizing the standard HMM to incorporate local dynamic patterns as well as global non-stationarity for speech signal modeling. The major component of the proposed non-stationary HMMs is a parametric regression model for each HMM state. The regression functions are intended to characterize the dynamic movements of the signal within an HMM state. Both the EM algorithm (or Baum-Welch algorithm) and the segmental K-means algorithm are generalized to accommodate the complex state duration information needed for the estimation of regression parameters. To allow for the flexibility of linear time warping in individual HMM states, an efficient algorithm is developed with the use of token-dependent auxiliary parameters. Although the auxiliary parameters are of no interest in themselves for modeling speech sound patterns, they provide an intermediate tool for achieving maximal accuracy in estimating the parameters of the regression models.
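To make the idea of a state-level regression function concrete, the following is a minimal sketch and not the article's algorithm. It assumes a scalar observation, a polynomial trend in the time elapsed since state entry, and a Gaussian residual with fixed variance; the abstract specifies only "parametric regression", so the polynomial form is an assumption. The names `trend_mean`, `segment_log_likelihood`, and `fit_linear_trend` are hypothetical, and the time-warping and auxiliary-parameter steps are left out.

```python
import math


def trend_mean(coeffs, t):
    """Mean of the state's output at time t (frames since state entry).

    Assumption: the regression function is a polynomial in t, with
    coeffs[p] multiplying t**p. A standard HMM is the special case
    where only coeffs[0] is present (a constant mean).
    """
    return sum(b * t ** p for p, b in enumerate(coeffs))


def segment_log_likelihood(obs, coeffs, variance):
    """Log-likelihood of one scalar segment assigned to a single state.

    Each frame is modeled as Gaussian around the time-varying trend mean.
    """
    ll = 0.0
    for t, o in enumerate(obs):
        resid = o - trend_mean(coeffs, t)
        ll += -0.5 * (math.log(2.0 * math.pi * variance) + resid ** 2 / variance)
    return ll


def fit_linear_trend(obs):
    """Ordinary least-squares fit of a degree-1 trend to one segment.

    Returns [intercept, slope]. This stands in for the regression
    parameter update only; it is not the generalized EM or segmental
    K-means procedure described in the abstract.
    """
    n = len(obs)
    ts = range(n)
    t_mean = sum(ts) / n
    o_mean = sum(obs) / n
    sxx = sum((t - t_mean) ** 2 for t in ts)
    sxy = sum((t - t_mean) * (o - o_mean) for t, o in zip(ts, obs))
    slope = sxy / sxx if sxx > 0 else 0.0
    return [o_mean - slope * t_mean, slope]


# A segment that rises steadily while the model stays in one state.
segment = [1.0, 3.0, 5.0, 7.0]
linear_coeffs = fit_linear_trend(segment)
constant_coeffs = [sum(segment) / len(segment)]  # stationary-state baseline

ll_trend = segment_log_likelihood(segment, linear_coeffs, variance=1.0)
ll_constant = segment_log_likelihood(segment, constant_coeffs, variance=1.0)
```

On a segment with a clear within-state movement such as this one, the trend model scores a higher log-likelihood than the constant-mean baseline at the same variance, which is the kind of dynamic pattern a stationary state cannot represent.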