Abstract

A new method is proposed for modelling state duration in hidden Markov model (HMM) speech recognition systems. State transition probabilities are made conditional on how long the current state has been occupied. The conventional fixed state transition probabilities a_ij are replaced by duration-dependent variables a_ij(d) that depend on the time d already spent in state i. In this way, state transition and state duration probabilities are combined to form duration-dependent transition probabilities. The transition probabilities are derived from the cumulative distribution function (CDF) of state duration. The training of HMMs with duration-dependent transitions is based on maximum likelihood segmentation of the training data, using the Viterbi algorithm. At each training iteration, the current HMM parameters are used to segment every training example. All the segments associated with each state are then used to update that state's observation and transition parameters. In experiments with a data set of the spoken English alphabet, durational modelling improves the recognition accuracy by 5.6%.
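
The abstract does not give the formula that maps the duration CDF to a_ij(d). As a rough illustration only, the sketch below assumes the standard discrete hazard-rate relation: the probability of leaving state i after exactly d frames is P(D = d) / (1 - F(d - 1)), where F is the duration CDF and F(0) = 0. The function name `exit_probabilities` and the input layout are hypothetical and not taken from the paper.

```python
def exit_probabilities(duration_pmf):
    """Duration-dependent probability of leaving a state.

    duration_pmf[k] is an estimate of P(D = k + 1), i.e. the probability
    that the state is occupied for exactly k + 1 frames (for example, a
    normalised histogram of segment lengths from a Viterbi segmentation).

    Returns a list `leave` where leave[k] approximates
    P(D = k + 1 | D >= k + 1) = P(D = k + 1) / (1 - F(k)).
    The matching self-loop probability is 1 - leave[k].

    Assumption: this hazard-rate form is a common way to turn a duration
    CDF into transition probabilities; the paper's exact expression may differ.
    """
    leave = []
    survival = 1.0  # 1 - F(d - 1): probability the state lasts at least d frames
    for p in duration_pmf:
        if survival <= 1e-12:
            # All probability mass is used up, so the state must be left.
            leave.append(1.0)
        else:
            leave.append(min(1.0, p / survival))
        survival -= p
    return leave


if __name__ == "__main__":
    # Toy duration distribution over 1, 2 and 3 frames.
    for d, a in enumerate(exit_probabilities([0.2, 0.3, 0.5]), start=1):
        print(f"d={d}: leave={a:.3f}, stay={1 - a:.3f}")
```

Under this assumption, a geometric duration distribution gives the same exit probability for every d, which recovers the conventional fixed a_ij. Any other duration shape makes the transition probability vary with the time already spent in the state.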

