Abstract

Because the basic processing units of the two knowledge sources differ, i.e., the speech segment on the acoustic side and the word on the linguistic side, stochastic speech recognition systems require careful balancing of acoustic and linguistic knowledge. In most systems, the combination of acoustic and linguistic scores is controlled by a heuristic parameter, the language weight. To achieve a robust balance, sequence length must be taken into account, because a speech segment is much shorter than a word duration. However, n-gram language modeling, the most common stochastic language model for speech recognition, does not account for the length of the word sequence. As a result, the optimal values of the language weight and the word insertion penalty for balancing acoustic and linguistic probabilities depend on the length of the word sequence. To address this problem, a new language model based on a Bernoulli trial model is developed that takes the length of the word sequence into account. Recognition experiments confirm that the proposed method achieves not only better recognition accuracy but also a more robust balance with the acoustic probability than the conventional n-gram model.
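For reference, the conventional decoding criterion the abstract refers to is sketched below; the symbols λ (language weight), γ (word insertion penalty), and N_W (number of words in the hypothesis) follow common notation and are assumptions here, not the paper's own formulation.

  \hat{W} = \arg\max_{W} \Bigl[ \log P(X \mid W) + \lambda \log P(W) + \gamma\, N_{W} \Bigr]

Here X is the acoustic observation and W a candidate word sequence. Under an n-gram model, \log P(W) scales roughly with the number of words, while the acoustic score scales with the number of frames, so the optimal λ and γ shift with utterance length; this is the imbalance the length-aware Bernoulli trial model is intended to remove.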
