Speaker adaptation through spectral transformation for HMM based speech recognition

H.C. Choi, R.W. King

doi:10.1109/sipnn.1994.344819

Abstract

The use of spectral transformation to perform speaker adaptation for HMM-based speech recognition is investigated. Three estimation methods for computing the transformation are compared: minimum mean square error (MMSE), canonical correlation analysis (CCA) and multilayer perceptrons (MLP). Using isolated words from the TI-46 database, it is found that CCA has the best adaptation performance. Moreover, a training-after-adaptation approach is found to have a higher adaptation performance than the one in which reference HMMs are not re-trained. With a suitable choice of reference speaker, less than 30% of the training data from a new speaker is required to achieve the same accuracy as the speaker-dependent models of that new speaker, when the CCA method is used with the training-after-adaptation approach.
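To make the idea of an MMSE-estimated spectral transformation concrete, the following is a minimal sketch and not the authors' implementation. It assumes an affine map between spectral feature vectors, fitted by ordinary least squares on frame pairs that have already been time-aligned between the two speakers (the alignment step, the affine form, and the direction of the mapping are all assumptions; the abstract does not specify them). The names `estimate_mmse_transform` and `apply_transform` are hypothetical.

```python
import numpy as np


def estimate_mmse_transform(source_frames, target_frames):
    """Fit an affine map (A, b) minimising the mean squared error
    between source_frames @ A + b and target_frames.

    Both inputs are (n_frames, n_features) arrays of paired frames.
    Assumption: the pairs are already time-aligned across speakers.
    """
    n_frames = source_frames.shape[0]
    # Append a constant column so the bias b is estimated jointly with A.
    augmented = np.hstack([source_frames, np.ones((n_frames, 1))])
    weights, *_ = np.linalg.lstsq(augmented, target_frames, rcond=None)
    return weights[:-1], weights[-1]


def apply_transform(frames, A, b):
    """Map frames from the source speaker's space into the target's."""
    return frames @ A + b
```

Under these assumptions, the fitted map could be applied to one speaker's feature vectors before HMM training or recognition; which speaker is mapped onto which depends on the adaptation approach being used.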
