Rapid Speaker Adaptation Based on MAPLR Using Adaptive Hybrid Prior Distributions Estimated from Reference Speakers

Young-Rok Song, Hyung-Soon Kim

doi:10.7776/ask.2011.30.6.315

Abstract

This paper proposes two methods for estimating the prior distribution in order to improve rapid speaker adaptation based on maximum a posteriori linear regression (MAPLR). In conventional MAPLR, the prior distribution of the transformation matrix is estimated from all of the training speakers used to build the speaker-independent model, and it is applied identically to every new speaker. To obtain a prior better suited to each new speaker, we first propose using the adaptation data to select a group of reference speakers whose acoustic characteristics are close to those of the new speaker, and then estimating the prior distribution from that group. Second, for MAPLR adaptation with a block-diagonal transformation matrix, we propose estimating the mean matrix and the covariance matrix of the prior distribution separately, each from one of two sets of transformation matrices of different forms obtained from the same training speakers. To evaluate the proposed methods, we run isolated word recognition experiments and measure word accuracy as a function of the number of adaptation words. The results show that, when the number of adaptation words is very small, the proposed methods yield a statistically significant improvement over conventional MAPLR adaptation.
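The two ideas in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and several details are assumptions because the abstract does not give them: the closeness score used to rank training speakers (`scores`), the number of reference speakers (`n_ref`), which of the two transform sets supplies the mean and which supplies the covariance, and the use of a plain sample mean and sample covariance. The function names are hypothetical.

```python
import numpy as np


def select_reference_speakers(scores, n_ref):
    """Return indices of the n_ref training speakers with the highest scores.

    Assumption: `scores` holds one closeness value per training speaker,
    computed from the adaptation data (the scoring rule is not given in
    the abstract, so it is left to the caller).
    """
    order = np.argsort(scores)[::-1]  # highest score first
    return order[:n_ref]


def estimate_hybrid_prior(mean_transforms, cov_transforms, ref_idx):
    """Estimate a prior mean matrix and covariance from reference speakers.

    mean_transforms, cov_transforms: arrays of shape (n_speakers, rows, cols),
    two sets of per-speaker transformation matrices for the same speakers.
    Assumption: the mean is taken from the first set and the covariance
    from the second; the abstract does not say which set plays which role.
    """
    ref_mean_set = mean_transforms[ref_idx]
    ref_cov_set = cov_transforms[ref_idx]

    # Prior mean matrix: average transform over the reference speakers.
    prior_mean = ref_mean_set.mean(axis=0)

    # Prior covariance: sample covariance of the vectorized transforms.
    # Needs at least two reference speakers.
    flat = ref_cov_set.reshape(len(ref_idx), -1)
    prior_cov = np.cov(flat, rowvar=False)
    return prior_mean, prior_cov


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_speakers, rows, cols = 20, 3, 4
    transforms_a = rng.normal(size=(n_speakers, rows, cols))  # placeholder data
    transforms_b = rng.normal(size=(n_speakers, rows, cols))  # placeholder data
    scores = rng.normal(size=n_speakers)  # placeholder closeness scores

    ref_idx = select_reference_speakers(scores, n_ref=5)
    mean, cov = estimate_hybrid_prior(transforms_a, transforms_b, ref_idx)
    print(mean.shape, cov.shape)
```

In this sketch the conventional MAPLR prior corresponds to passing every speaker index as `ref_idx` and the same transform set for both arguments.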

Published in: The Journal of the Acoustical Society of Korea

