Rapid Speaker Adaptation Based on Combination of KPCA and Latent Variable Model

Zohreh Ansari, Seyed Jahanshah Kabudian, Farshad Almasganj

doi:10.1007/s00034-021-01660-6

Abstract

Speaker adaptation shifts a speaker-independent model toward the speech characteristics of a new speaker in order to improve speech recognition performance. Kernel eigenspace-based speaker adaptation methods perform satisfactorily with only a small amount of adaptation data. In these methods, kernel principal component analysis (KPCA) is applied to the training speaker space to create a kernel eigenspace, and the acoustic model adapted to the new user is then computed in that space. One limitation of KPCA is that it cannot define a precise pre-image, back in the speaker space, of the model adapted in the kernel eigenspace. As a result, a large amount of computation is required to perform adaptation. Previously developed solutions for computing an approximate pre-image of the adapted model do not necessarily yield an optimal result. In this paper, we therefore propose an efficient solution that constructs a more reliable pre-image of the adapted model in the speaker space. To this end, we use a latent variable model to define a probabilistic description of the mapping between the kernel eigenspace and the speaker space. Experiments were conducted on two speech databases: FARSDAT (Persian) and TIMIT (English). Using a typical HMM-based automatic speech recognition system, we verified that the proposed method, with about three seconds of adaptation data, achieves relative improvements in phoneme recognition accuracy of up to 4.4% and 7.6% over the speaker-independent model on FARSDAT and TIMIT, respectively. Moreover, the proposed approach outperformed the other kernel eigenspace-based adaptation methods.
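
As a rough illustration of the kernel eigenspace step described in the abstract, the following Python sketch builds a KPCA eigenspace from toy "speaker vectors" and projects a new vector into it. It is not the authors' implementation: the RBF kernel, the gamma value, the random data, and the function names fit_kpca and project are all assumptions made for illustration. The sketch deliberately stops at the projection. Mapping a point in the eigenspace back to the speaker space is the pre-image problem the paper addresses, and it is not attempted here.

```python
import numpy as np


def rbf_kernel(A, B, gamma):
    """RBF kernel matrix between the rows of A and the rows of B."""
    d2 = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(d2, 0.0))


def fit_kpca(X, gamma, n_components):
    """Build a kernel eigenspace from training vectors X (one row per speaker)."""
    n = X.shape[0]
    K = rbf_kernel(X, X, gamma)
    # Centre the kernel matrix, i.e. centre the data in feature space.
    H = np.eye(n) - np.ones((n, n)) / n
    Kc = H @ K @ H
    eigvals, eigvecs = np.linalg.eigh(Kc)  # eigenvalues in ascending order
    idx = np.argsort(eigvals)[::-1][:n_components]
    lam, V = eigvals[idx], eigvecs[:, idx]
    # Scale coefficients so each feature-space eigenvector has unit norm.
    alphas = V / np.sqrt(lam)
    return {"X": X, "K": K, "gamma": gamma, "alphas": alphas}


def project(model, x):
    """Coordinates of a single vector x in the kernel eigenspace."""
    X, K = model["X"], model["K"]
    k = rbf_kernel(x[None, :], X, model["gamma"])[0]
    # Centre the new kernel vector consistently with the training kernel.
    kc = k - K.mean(axis=0) - k.mean() + K.mean()
    return model["alphas"].T @ kc


# Toy data standing in for training-speaker vectors (assumption, not real speech).
rng = np.random.default_rng(0)
train_speakers = rng.normal(size=(20, 5))
new_speaker = rng.normal(size=5)

model = fit_kpca(train_speakers, gamma=0.1, n_components=3)
coords = project(model, new_speaker)
print(coords.shape)
```

The projection only needs kernel evaluations against the training vectors, which is why the forward direction is cheap. There is no corresponding closed-form expression for the inverse direction with a nonlinear kernel, which is the gap that approximate pre-image methods try to fill.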

Journal: Circuits, Systems, and Signal Processing
Publication Date: Mar 22, 2021
Citations: 1

Similar Papers

A comparative study of two kernel eigenspace-based speaker adaptation methods on large vocabulary continuous speech recognition
Roger Hsiao ... Brian Mak
04 Sep 2005

Speaker adaptive bottleneck features extraction for LVCSR based on discriminative learning of speaker codes
Changqing Kong ... Hui Jiang
01 Sep 2014

Fast Adaptation of Deep Neural Network Based on Discriminant Codes for Speech Recognition
Shaofei Xue ... Qingfeng Liu
IEEE/ACM Transactions on Audio, Speech, and Language Processing | VOL. 22
01 Dec 2014

Rapid speaker adaptation based on D-code extracted from BLSTM-RNN in LVCSR
Shaofei Xue ... Zhiying Huang
01 Oct 2016
