Speaker adaptation based on combination of MAP estimation and weighted neighbor regression

Lei He, Jian Wu, Wenhu Wu, Ditang Fang

doi:10.1109/icassp.2000.859126

Abstract

This paper describes a novel speaker adaptation method that combines maximum a posteriori (MAP) estimation with weighted neighbor regression (WNR). The primary disadvantage of MAP adaptation is that only the parameters of models observed in the adaptation data are updated, so a great deal of adaptation data is required. WNR overcomes this problem by using information from neighboring models. The parameter relationships between the speaker-independent models and the speaker-adapted models are trained by applying distance-weighted regression to a set of neighboring model parameters with and without MAP adaptation. The method gives a nearly 15 percent error rate reduction with 10 adaptation utterances and more than 51 percent with 250 utterances in Chinese syllable recognition. In addition, vector field smoothing (VFS) can be shown to be a degenerate case of WNR.
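The abstract gives no formulas, so the following is only a minimal illustrative sketch of the general idea, not the authors' implementation. It assumes that models are represented by Gaussian mean vectors, that MAP adaptation uses the standard conjugate-prior mean update with a prior weight `tau`, and that the regression is a per-dimension weighted linear fit with Gaussian kernel weights. The function names (`map_mean`, `wnr_predict`) and the parameters `tau` and `bandwidth` are hypothetical choices made for this example.

```python
import math


def map_mean(si_mean, frames, tau=10.0):
    """MAP update of a Gaussian mean (assumed form: prior weight tau,
    frames hard-assigned to this model)."""
    n = len(frames)
    return [
        (tau * m + sum(f[d] for f in frames)) / (tau + n)
        for d, m in enumerate(si_mean)
    ]


def wnr_predict(target_si, neighbor_si, neighbor_map, bandwidth=1.0):
    """Estimate the adapted mean of a model that saw no adaptation data.

    neighbor_si  : speaker-independent means of neighbors that were adapted
    neighbor_map : the same neighbors' means after MAP adaptation
    Each dimension gets its own weighted linear regression from the
    speaker-independent value to the MAP-adapted value.
    """
    # Assumption: Gaussian kernel, so closer neighbors count more.
    weights = [
        math.exp(-math.dist(target_si, s) ** 2 / (2.0 * bandwidth ** 2))
        for s in neighbor_si
    ]
    w_sum = sum(weights)
    predicted = []
    for d in range(len(target_si)):
        xs = [s[d] for s in neighbor_si]
        ys = [a[d] for a in neighbor_map]
        x_bar = sum(w * x for w, x in zip(weights, xs)) / w_sum
        y_bar = sum(w * y for w, y in zip(weights, ys)) / w_sum
        var = sum(w * (x - x_bar) ** 2 for w, x in zip(weights, xs))
        cov = sum(
            w * (x - x_bar) * (y - y_bar)
            for w, x, y in zip(weights, xs, ys)
        )
        # With a slope of 1 the prediction is the target mean plus the
        # weighted average neighbor shift, i.e. a pure translation.
        slope = cov / var if var > 1e-12 else 1.0
        predicted.append(y_bar + slope * (target_si[d] - x_bar))
    return predicted
```

In this sketch, forcing the slope to 1 reduces the prediction to the speaker-independent mean plus a distance-weighted average of the neighbors' shifts. That is one way to picture how a smoothing scheme based on interpolated transfer vectors, such as VFS, could arise as a special case, although the paper's own argument may differ.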
