Abstract

Decision tree state tying performs divisive clustering that combines the phonetics and acoustics of the speech signal for large vocabulary continuous speech recognition. A tree is built by successively splitting the observation frames of a phonetic unit according to the best phonetic questions. To prevent building over-large tree models, a stopping criterion is required to limit tree growth. Accordingly, it is crucial to exploit goodness-of-split criteria to choose the best questions for node splitting and to test whether the splitting should be terminated. In this paper, we apply Hubert's Γ statistic as the node splitting criterion and the T²-statistic as the stopping criterion. Hubert's Γ statistic sufficiently characterizes the clustering structure in the given data. This cluster validity criterion is adopted to select the best questions for splitting tree nodes. Further, we examine the population closeness of the two split nodes at a given significance level. The T²-statistic, expressed through an F distribution, is used to verify whether the mean vectors of the two nodes are close together. Splitting is stopped when this closeness is verified. In experiments on Mandarin speech recognition, the proposed methods achieve better syllable recognition rates with smaller tree models than the conventional maximum likelihood and minimum description length criteria.
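The two criteria can be illustrated with a minimal sketch. It assumes the textbook forms of both statistics, which may differ in detail from the formulation used in the paper: Hubert's Γ is taken as the average product of pairwise frame distances and the distances between the corresponding node centroids, and the stopping test is the standard two-sample Hotelling T² with a pooled covariance. The function names (`hubert_gamma`, `hotelling_two_sample`, `should_stop_split`) and the caller-supplied critical value `f_critical` are hypothetical; the critical value could come from an F table or, for example, `scipy.stats.f.ppf(1 - alpha, df1, df2)`.

```python
import numpy as np


def hubert_gamma(points, labels):
    """Raw Hubert's Gamma for one candidate split (assumed textbook form).

    X(i, j) is the Euclidean distance between frames i and j; Y(i, j) is the
    distance between the centroids of the nodes they are assigned to.
    Gamma is the mean of X * Y over all pairs i < j.
    """
    points = np.asarray(points, dtype=float)
    labels = np.asarray(labels)
    centroids = {k: points[labels == k].mean(axis=0) for k in np.unique(labels)}
    reps = np.array([centroids[k] for k in labels])
    i, j = np.triu_indices(len(points), k=1)
    x = np.linalg.norm(points[i] - points[j], axis=1)
    y = np.linalg.norm(reps[i] - reps[j], axis=1)
    return float(np.mean(x * y))


def hotelling_two_sample(a, b):
    """Two-sample Hotelling T^2 with pooled covariance.

    Returns (t2, f_stat, (df1, df2)), where f_stat follows an F(df1, df2)
    distribution under the hypothesis that both nodes share a mean vector.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n1, p = a.shape
    n2 = b.shape[0]
    df2 = n1 + n2 - p - 1
    if df2 <= 0:
        raise ValueError("too few frames for the given feature dimension")
    diff = a.mean(axis=0) - b.mean(axis=0)
    s1 = np.atleast_2d(np.cov(a, rowvar=False))  # unbiased (ddof=1) estimates
    s2 = np.atleast_2d(np.cov(b, rowvar=False))
    pooled = ((n1 - 1) * s1 + (n2 - 1) * s2) / (n1 + n2 - 2)
    t2 = (n1 * n2) / (n1 + n2) * float(diff @ np.linalg.solve(pooled, diff))
    f_stat = df2 / (p * (n1 + n2 - 2)) * t2
    return t2, f_stat, (p, df2)


def should_stop_split(a, b, f_critical):
    """Stop splitting when equality of the two node means is not rejected."""
    _, f_stat, _ = hotelling_two_sample(a, b)
    return f_stat <= f_critical
```

Under these assumptions, a node would be split by the question that yields the largest Γ over its candidate yes/no partitions, and the split would be discarded when `should_stop_split` returns true for the two resulting nodes.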
