Abstract

We previously proposed applying tree-structured speaker clustering to supervised speaker adaptation. This paper extends that approach to unsupervised speaker adaptation and to speaker-independent (SI) speech recognition. The method selects one speaker cluster from among multiple reference speaker clusters arranged in a tree structure. Cluster selection, unlike parameter training, enables quick adaptation using only a small amount of training data. The method was applied to a hidden Markov network (HMnet) and evaluated in Japanese phoneme and phrase recognition experiments. The results show effective unsupervised speaker adaptation using only 5 s of calibration speech. In the SI speech recognition experiments, the method reduced the error rate by 8.5% compared with the conventional SI speech recognition method.
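
The cluster-selection step can be illustrated with a minimal sketch. It is not the implementation evaluated in this paper. It assumes that each tree node carries a single diagonal-covariance Gaussian in place of a full HMnet, that every node in the tree is a candidate, and that the node with the highest log-likelihood on the calibration frames is selected. The names `ClusterNode`, `log_likelihood`, and `select_cluster`, and all numeric values, are hypothetical.

```python
from __future__ import annotations

import math
from dataclasses import dataclass, field


@dataclass
class ClusterNode:
    """One reference speaker cluster in the tree (hypothetical structure)."""
    name: str
    mean: list[float]   # per-dimension mean of the stand-in Gaussian
    var: list[float]    # per-dimension variance (diagonal covariance)
    children: list[ClusterNode] = field(default_factory=list)


def log_likelihood(node: ClusterNode, frames: list[list[float]]) -> float:
    """Sum of diagonal-Gaussian log-densities of all frames under a node."""
    total = 0.0
    for frame in frames:
        for x, m, v in zip(frame, node.mean, node.var):
            total += -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
    return total


def select_cluster(root: ClusterNode, frames: list[list[float]]) -> ClusterNode:
    """Score every node in the tree and return the best-scoring one.

    No model parameters are re-estimated; the calibration frames are
    only used to choose among existing cluster models.
    """
    best, best_score = root, log_likelihood(root, frames)
    stack = list(root.children)
    while stack:
        node = stack.pop()
        score = log_likelihood(node, frames)
        if score > best_score:
            best, best_score = node, score
        stack.extend(node.children)
    return best


# Toy tree: the root covers all speakers, deeper nodes are narrower clusters.
root = ClusterNode("all", [0.0, 0.0], [4.0, 4.0], [
    ClusterNode("A", [-1.0, -1.0], [1.0, 1.0]),
    ClusterNode("B", [1.0, 1.0], [1.0, 1.0], [
        ClusterNode("B1", [1.5, 1.5], [0.25, 0.25]),
        ClusterNode("B2", [0.5, 0.5], [0.25, 0.25]),
    ]),
])

# A few made-up calibration frames from a new speaker.
calibration = [[1.4, 1.6], [1.6, 1.5], [1.5, 1.4]]
selected = select_cluster(root, calibration)
print(selected.name)
```

Because the sketch only compares the scores of existing models, a few frames are enough to make a choice, which is the property the abstract attributes to cluster selection as opposed to parameter training. A top-down descent that follows the best-scoring child at each level would be an alternative search strategy under the same assumptions.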
