Probabilistic mapping networks for speaker recognition

Haizhou Li, J.-P. Haton, Yifan Gong

doi:10.1109/icassp.1996.550601

Abstract

The expectation-maximization (EM) algorithm is a general technique for maximum likelihood estimation (MLE). In this paper, we address two theoretical issues concerning Gaussian mixture modeling (GMM) within the EM framework. First, we propose an EM algorithm for estimating the parameters of a GMM structure dedicated to speaker recognition, the probabilistic mapping network (PMN), in which each Gaussian probability density function is realized as an internal node. The EM algorithm is thereby extended to the supervised learning of a multicategory classification problem and serves as the parameter estimator of the neural network classifier. Second, a generalized EM (GEM) algorithm is developed as an alternative approach to the MLE problem for the PMN. The effectiveness of the proposed PMN architecture and of the developed EM algorithms is assessed in a set of speaker recognition experiments. It is shown that GEM converges faster than EM to the same solution space.
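For orientation, the standard EM iteration for a Gaussian mixture, which the abstract takes as its starting point, can be sketched as follows. This is a minimal illustration and not the PMN or GEM algorithm of the paper: it assumes diagonal covariances, one independently trained mixture per speaker, and classification by highest total log-likelihood. All function names (`em_gmm`, `total_log_likelihood`, `classify`) and the variance floor of `1e-6` are hypothetical choices made for this example.

```python
import numpy as np

def log_gauss_diag(X, means, variances):
    """Log density of every row of X under every diagonal Gaussian, shape (N, K)."""
    diff = X[:, None, :] - means[None, :, :]                       # (N, K, D)
    quad = np.sum(diff ** 2 / variances[None, :, :], axis=2)       # (N, K)
    log_det = np.sum(np.log(2.0 * np.pi * variances), axis=1)      # (K,)
    return -0.5 * (quad + log_det[None, :])

def total_log_likelihood(X, weights, means, variances):
    """Sum over frames of log sum_k w_k N(x | mu_k, var_k)."""
    log_joint = np.log(weights)[None, :] + log_gauss_diag(X, means, variances)
    return float(np.logaddexp.reduce(log_joint, axis=1).sum())

def em_gmm(X, K, n_iter=20, seed=0):
    """Fit a K-component diagonal GMM to X (N, D) with plain EM.

    Returns (weights, means, variances, history), where history holds the
    log-likelihood measured at the start of each iteration.
    """
    rng = np.random.default_rng(seed)
    N, D = X.shape
    # Initialisation (an assumption of this sketch): random data points as
    # means, global variance for every component, uniform weights.
    means = X[rng.choice(N, size=K, replace=False)].copy()
    variances = np.tile(X.var(axis=0) + 1e-6, (K, 1))
    weights = np.full(K, 1.0 / K)
    history = []
    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each frame.
        log_joint = np.log(weights)[None, :] + log_gauss_diag(X, means, variances)
        log_norm = np.logaddexp.reduce(log_joint, axis=1)
        history.append(float(log_norm.sum()))
        resp = np.exp(log_joint - log_norm[:, None])                # (N, K)
        # M-step: closed-form maximum likelihood updates.
        Nk = resp.sum(axis=0)
        weights = Nk / N
        means = (resp.T @ X) / Nk[:, None]
        second_moment = (resp.T @ (X ** 2)) / Nk[:, None]
        variances = np.maximum(second_moment - means ** 2, 1e-6)    # variance floor
    return weights, means, variances, history

def classify(X, speaker_models):
    """Return the index of the speaker model with the highest log-likelihood on X."""
    scores = [total_log_likelihood(X, w, m, v) for (w, m, v) in speaker_models]
    return int(np.argmax(scores))
```

In this sketch each speaker model is estimated in isolation from that speaker's data, and the log-likelihood recorded in `history` should not decrease from one iteration to the next, which is the usual sanity check for an EM implementation.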
