Abstract

In acoustic modeling for speech recognition, the Gaussian distribution or the Gaussian mixture distribution is widely used. The general reason for preferring the Gaussian distribution in the parametric modeling of an unknown ensemble is the central limit theorem, and many of its properties are theoretically well understood. However, for the particular problem of modeling the time series of an acoustic feature from a limited number of training samples for speech recognition, there is no guarantee that a method based on the Gaussian distribution is always optimal. This paper therefore proposes an acoustic modeling approach based on the generalized Laplacian distribution, which can represent a wider range of distribution shapes and includes the Laplacian and Gaussian distributions as special cases. The formulation of the generalized Laplacian distribution and the method for estimating its parameters are described. The acoustic model with generalized Laplacian mixture output distributions is constructed by retraining a hidden Markov model that has Gaussian mixture output distributions. A continuous speech recognition experiment using naturally uttered speech shows that recognition performance improves compared with recognition based on the Gaussian mixture distribution. © 2002 Wiley Periodicals, Inc. Electron Comm Jpn Pt 2, 85(11): 32–42, 2002; Published online in Wiley InterScience (www.interscience.wiley.com). DOI 10.1002/ecjb.10093
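
As an illustration of how one family can contain both the Laplacian and the Gaussian distributions, the following minimal sketch evaluates a log-density of the commonly used generalized Gaussian form, p(x) = beta / (2 alpha Gamma(1/beta)) * exp(-(|x - mu| / alpha)^beta). This parameterization is an assumption; the abstract does not state the paper's formulation, which may differ. The function and parameter names (`gen_laplacian_logpdf`, `mu`, `alpha`, `beta`) are hypothetical and chosen only for this example.

```python
import math


def gen_laplacian_logpdf(x, mu=0.0, alpha=1.0, beta=2.0):
    """Log-density of an assumed generalized Laplacian (generalized Gaussian) form.

    p(x) = beta / (2 * alpha * Gamma(1 / beta)) * exp(-(|x - mu| / alpha) ** beta)
    mu: location, alpha: scale (> 0), beta: shape (> 0).
    This parameterization is an assumption, not taken from the paper.
    """
    if alpha <= 0 or beta <= 0:
        raise ValueError("alpha and beta must be positive")
    # Log of the normalizing constant; lgamma avoids overflow for small beta.
    log_norm = math.log(beta) - math.log(2.0 * alpha) - math.lgamma(1.0 / beta)
    return log_norm - (abs(x - mu) / alpha) ** beta


def gaussian_logpdf(x, mu, sigma):
    """Reference log-density of a Gaussian with mean mu and std sigma."""
    return -0.5 * math.log(2.0 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2.0 * sigma ** 2)


def laplace_logpdf(x, mu, b):
    """Reference log-density of a Laplacian with location mu and scale b."""
    return -math.log(2.0 * b) - abs(x - mu) / b
```

Under this assumed form, `beta = 2` with `alpha = sqrt(2) * sigma` reproduces the Gaussian log-density, and `beta = 1` with `alpha = b` reproduces the Laplacian one. Other values of `beta` give heavier or lighter tails than either special case.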

