Abstract

Most current automatic speech recognition systems based on hidden Markov models (HMMs) cluster, or tie together, subsets of the subword units with which speech is represented. This tying improves recognition accuracy when systems are trained with limited data, and is performed by classifying the sub-phonetic units using a series of binary tests based on speech production, called linguistic questions. This paper describes a new method for automatically determining the best combinations of subword units to form these questions. The proposed hybrid algorithm clusters state distributions of context-independent phones to obtain questions for triphonetic contexts. Experiments confirm that the questions thus generated can replace manually generated questions and can provide improved recognition accuracy. Automatic generation of questions has the additional important advantage of extending to languages whose phonetic structure is not well understood by the system designer, and it can be used effectively in situations where the subword units are not phonetically motivated.

