Abstract
In acoustic modeling for speech recognition, the Gaussian distribution or the Gaussian mixture distribution is widely used. The general reason for preferring the Gaussian distribution in the parametric modeling of an unknown ensemble is the central limit theorem, and many of its properties are theoretically well understood. For the particular problem of modeling the time series of an acoustic feature from a limited number of training samples, however, there is no guarantee that a method based on the Gaussian distribution is always optimal. This paper therefore proposes an acoustic modeling approach based on the generalized Laplacian distribution, which can represent a wider range of distribution shapes and includes the Laplacian and Gaussian distributions as special cases. The formulation of the generalized Laplacian distribution and the method for estimating its parameters are described. The acoustic model with a generalized Laplacian mixture output distribution is constructed by retraining a hidden Markov model that has a Gaussian mixture output distribution. A continuous speech recognition experiment using naturally uttered speech shows that recognition performance improves compared with recognition based on the Gaussian mixture distribution. © 2002 Wiley Periodicals, Inc. Electron Comm Jpn Pt 2, 85(11): 32–42, 2002; Published online in Wiley InterScience (www.interscience.wiley.com). DOI 10.1002/ecjb.10093
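
The claim that one family covers both the Laplacian and the Gaussian shape can be illustrated with a minimal sketch. The abstract does not give the paper's own parameterization, so the code below assumes the commonly used generalized Gaussian form p(x) = beta / (2 alpha Gamma(1/beta)) exp(-(|x - mu| / alpha)^beta); the function names and the symbols mu, alpha and beta are illustrative choices, not taken from the paper.

```python
import math


def generalized_laplacian_pdf(x, mu=0.0, alpha=1.0, beta=2.0):
    """Assumed density: beta / (2*alpha*Gamma(1/beta)) * exp(-(|x-mu|/alpha)**beta).

    mu is the location, alpha > 0 the scale, beta > 0 the shape exponent.
    This parameterization is an assumption, not the paper's stated formula.
    """
    if alpha <= 0 or beta <= 0:
        raise ValueError("alpha and beta must be positive")
    norm = beta / (2.0 * alpha * math.gamma(1.0 / beta))
    return norm * math.exp(-((abs(x - mu) / alpha) ** beta))


def laplacian_pdf(x, mu=0.0, b=1.0):
    """Standard Laplacian density with scale b."""
    return math.exp(-abs(x - mu) / b) / (2.0 * b)


def gaussian_pdf(x, mu=0.0, sigma=1.0):
    """Standard Gaussian density with standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


if __name__ == "__main__":
    x = 0.7
    # beta = 1 with alpha = b gives the Laplacian density.
    print(generalized_laplacian_pdf(x, 0.0, 1.5, 1.0), laplacian_pdf(x, 0.0, 1.5))
    # beta = 2 with alpha = sigma * sqrt(2) gives the Gaussian density.
    print(generalized_laplacian_pdf(x, 0.0, 1.2 * math.sqrt(2.0), 2.0),
          gaussian_pdf(x, 0.0, 1.2))
```

Under this assumed form, beta = 1 gives the Laplacian and beta = 2 gives the Gaussian; smaller values of beta give a sharper peak and heavier tails, and larger values give a flatter, more box-like shape.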