Speaker Adaptation Using i-Vector Based Clustering

Min Soo Kim, Gil-Jin Jang, Ji-Hwan Kim, Minho Lee

doi:10.3837/tiis.2020.07.003

Abstract

We propose a novel speaker adaptation method based on acoustic model clustering. The similarity between speakers is defined by the cosine distance between their i-vectors (intermediate vectors), and several efficient clustering algorithms are applied to obtain speaker subsets with distinct characteristics. The speaker-independent model is then retrained on the training data of each speaker subset, as grouped by the clustering results, and unknown speech is recognized by the retrained model of the closest cluster. The proposed method is applied to a large-scale speech recognition system built on a hybrid hidden Markov model and deep neural network framework. Word error rates were evaluated on the Resource Management database. Compared with the conventional speaker-independent speech recognition model, the proposed speaker adaptation method using i-vector based clustering achieved relative improvements of up to 12.2% for the conventional fully connected neural network and up to 10.5% for the bidirectional long short-term memory network.
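The clustering and cluster-selection steps described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the i-vectors are synthetic 8-dimensional stand-ins (i-vector extraction is not shown), the clustering is a simple spherical k-means with farthest-point initialization (the abstract does not name the algorithms used), and the function names `cosine_distance`, `cosine_kmeans`, and `closest_cluster` are hypothetical. Retraining the acoustic model per cluster is outside the scope of this sketch.

```python
import numpy as np

def unit_rows(x):
    """Scale vectors to unit length so dot products equal cosine similarity."""
    x = np.asarray(x, dtype=float)
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def cosine_distance(a, b):
    """Cosine distance between two vectors: 1 - cos(angle between a and b)."""
    return 1.0 - float(unit_rows(a) @ unit_rows(b))

def cosine_kmeans(ivectors, k, iters=20):
    """Group speakers by cosine similarity (spherical k-means, an assumed choice)."""
    x = unit_rows(ivectors)
    # Farthest-point initialization: start from the first speaker, then
    # repeatedly add the speaker least similar to every centroid chosen so far.
    chosen = [0]
    while len(chosen) < k:
        best_sim = (x @ x[chosen].T).max(axis=1)
        chosen.append(int(np.argmin(best_sim)))
    centroids = x[chosen].copy()
    for _ in range(iters):
        labels = np.argmax(x @ centroids.T, axis=1)
        for c in range(k):
            members = x[labels == c]
            if len(members):  # keep the old centroid if a cluster empties
                centroids[c] = unit_rows(members.mean(axis=0))
    labels = np.argmax(x @ centroids.T, axis=1)
    return centroids, labels

def closest_cluster(ivector, centroids):
    """Index of the cluster whose centroid has the smallest cosine distance."""
    return int(np.argmax(centroids @ unit_rows(ivector)))

# Synthetic stand-ins for i-vectors: two groups of five "speakers" each,
# scattered around two orthogonal directions.
rng = np.random.default_rng(1)
base_a = np.zeros(8)
base_a[0] = 1.0
base_b = np.zeros(8)
base_b[1] = 1.0
speakers = np.vstack([
    base_a + 0.05 * rng.standard_normal((5, 8)),
    base_b + 0.05 * rng.standard_normal((5, 8)),
])

centroids, labels = cosine_kmeans(speakers, k=2)

# At recognition time, the i-vector of the unknown speech selects a cluster;
# the model retrained on that cluster's speakers would then be used.
selected = closest_cluster(base_a, centroids)
print("cluster labels:", labels.tolist())
print("selected cluster for test i-vector:", selected)
```

In this sketch, choosing a cluster by the highest cosine similarity to its centroid is equivalent to choosing the smallest cosine distance, since the distance is defined as one minus the similarity.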

Journal: KSII Transactions on Internet and Information Systems
