Abstract

Rate of speech (ROS) is an important factor in speech recognition. We present a new method for measuring speech rate: it first normalizes the durations of different acoustic units to a standard duration, then builds a trigram duration model to measure the speech rate of a sentence. Based on the standard duration, we propose two methods that compensate for the effects of speech rate variation in a data corpus, achieving an 11% error rate reduction in Mandarin digit string recognition.
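
As a rough illustration of the normalization idea only, the sketch below maps each unit's duration onto a common scale so that durations of different acoustic units become comparable. The abstract does not give the normalization formula, so the z-score used here is an assumption, as are all function names and the toy data; the trigram duration model and the two compensation methods are not shown.

```python
from statistics import mean, pstdev

def unit_stats(training):
    """Collect (mean, std) of duration per acoustic unit.

    training: iterable of (unit, duration_in_seconds) pairs.
    """
    by_unit = {}
    for unit, dur in training:
        by_unit.setdefault(unit, []).append(dur)
    return {u: (mean(d), pstdev(d)) for u, d in by_unit.items()}

def normalize(sentence, stats):
    """Map each (unit, duration) to a unit-independent score.

    Assumption: a z-score stands in for the paper's "standard duration".
    """
    scores = []
    for unit, dur in sentence:
        mu, sigma = stats[unit]
        scores.append((dur - mu) / sigma if sigma > 0 else 0.0)
    return scores

def sentence_rate_score(sentence, stats):
    """Average normalized duration of a sentence.

    Negative values mean shorter-than-typical units, i.e. faster speech.
    """
    return mean(normalize(sentence, stats))

# Hypothetical toy data: two units with different typical durations.
training = [("a", 0.10), ("a", 0.20), ("b", 0.30), ("b", 0.50)]
stats = unit_stats(training)
fast_sentence = [("a", 0.10), ("b", 0.30)]
score = sentence_rate_score(fast_sentence, stats)
```

Because both units in `fast_sentence` sit one standard deviation below their own means, they contribute equally to the score even though their raw durations differ by a factor of three, which is the point of normalizing before measuring rate.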
