Abstract
Speaker adaptation shifts a speaker-independent model toward the speech characteristics of a new speaker in order to improve speech recognition performance. Kernel eigenspace-based speaker adaptation methods provide satisfactory performance using only a small amount of adaptation data. In these methods, kernel principal component analysis (KPCA) is applied to the training speaker space to create a kernel eigenspace, and the acoustic model adapted to the new speaker is then calculated in that space. One limitation of KPCA is that it cannot define an exact pre-image of the model adapted in the kernel eigenspace back in the speaker space. As a result, a large amount of computation is required to perform adaptation. Previously developed solutions for calculating an approximate pre-image of the adapted model do not necessarily lead to optimal conditions. In this paper, we therefore propose an efficient solution to this problem that constructs a more reliable pre-image of the adapted model in the speaker space. To this end, we use a latent variable model to define a probabilistic description of the mapping between the kernel eigenspace and the speaker space. Experiments were conducted on two speech databases: FARSDAT, a Persian database, and TIMIT, an English database. Using a typical HMM-based automatic speech recognition system, we verified that the proposed method, with about three seconds of adaptation data, achieves up to 4.4% and 7.6% relative improvement in phoneme recognition accuracy over the speaker-independent model on FARSDAT and TIMIT, respectively. Moreover, the proposed approach outperformed the other kernel eigenspace-based adaptation methods.
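To make the pre-image problem concrete, the following is a minimal sketch of KPCA followed by a classic fixed-point pre-image approximation for Gaussian kernels. It is not the latent-variable method proposed in the paper. It only illustrates the kind of approximate pre-image step that the abstract refers to. The Gaussian kernel, the NumPy implementation, and all function and parameter names (`fit_kpca`, `project`, `expansion_weights`, `approximate_preimage`, `gamma`) are assumptions made for illustration, and the rows of `X` are generic vectors standing in for points in the speaker space.

```python
import numpy as np


def rbf_kernel(A, B, gamma):
    """Gaussian kernel exp(-gamma * ||a - b||^2) for all row pairs of A and B."""
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))


def fit_kpca(X, gamma, n_components):
    """Fit KPCA on training vectors X (one row per training point)."""
    n = X.shape[0]
    K = rbf_kernel(X, X, gamma)
    H = np.eye(n) - np.full((n, n), 1.0 / n)      # centering matrix
    eigvals, eigvecs = np.linalg.eigh(H @ K @ H)  # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:n_components]
    # Scale coefficients so each feature-space eigenvector has unit norm.
    alphas = eigvecs[:, top] / np.sqrt(eigvals[top])
    return {"X": X, "K": K, "H": H, "gamma": gamma, "alphas": alphas}


def project(model, x):
    """Coordinates of a new vector x in the kernel eigenspace."""
    n = model["X"].shape[0]
    k = rbf_kernel(x[None, :], model["X"], model["gamma"])[0]
    # Centre the kernel vector consistently with the centred training kernel.
    k_centered = model["H"] @ (k - model["K"] @ np.full(n, 1.0 / n))
    return model["alphas"].T @ k_centered


def expansion_weights(model, beta):
    """Weights w such that the eigenspace point equals sum_i w[i] * phi(X[i])."""
    n = model["X"].shape[0]
    return model["H"] @ (model["alphas"] @ beta) + 1.0 / n


def approximate_preimage(model, beta, z0, n_iter=100, tol=1e-8):
    """Fixed-point iteration for an approximate pre-image (Gaussian kernel only)."""
    w = expansion_weights(model, beta)
    z = np.array(z0, dtype=float)
    for _ in range(n_iter):
        kz = w * rbf_kernel(z[None, :], model["X"], model["gamma"])[0]
        denom = kz.sum()
        if abs(denom) < 1e-12:  # the iteration is undefined here, so stop
            break
        z_new = (kz @ model["X"]) / denom
        converged = np.linalg.norm(z_new - z) < tol
        z = z_new
        if converged:
            break
    return z
```

A typical call would fit the model on the training vectors, compute `beta = project(model, x_new)`, and then call `approximate_preimage(model, beta, z0=x_new)`. The result depends on the starting point `z0` and is only a local solution, which is one reason such approximations do not necessarily reach an optimal pre-image.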