An active approach to speaker and task adaptation based on automatic analysis of vocabulary confusability

Qiang Huo, Wei Li

doi:10.21437/interspeech.2007-127

Abstract

In automatic speech recognition (ASR), speaker and task adaptation has become an important research topic in recent years. Although the recognition performance of a speaker-independent (SI) ASR system can be sufficiently good for many tasks, a large performance gap remains between an SI system and a speaker-dependent (SD) ASR system. A number of successful adaptation techniques have been developed that modify SI model parameters to favor the speaker and task given some adaptation data, but little attention has been paid to designing the adaptation script effectively and efficiently so that the adapted system benefits most from the data. I have therefore chosen this aspect as the research topic of this thesis.

In recent years, the idea of active learning has been applied to several ASR applications. Given a large amount of unlabelled training data, manual annotation effort can be reduced by intelligently selecting for manual labelling only the subset of training data that is most useful for building the application at hand. In this thesis, the concept of active learning is extended to supervised speaker and task adaptation: the system takes the initiative in eliciting a small (ideally minimum) amount of adaptation data from the user in order to achieve a high (ideally maximum) performance improvement when the HMMs are adapted with the elicited data. Based on this concept, the confusability of the task vocabulary, which is closely related to the difficulty of the given task, is analyzed using a new DTW-based HMM dissimilarity measure. The adaptation script is then generated from this vocabulary confusability information.

In this thesis, adaptation script generation is cast as two constrained optimization problems with the same constraints but different objective functions. The first is a maximum coverage problem with a knapsack constraint. The second is a nonlinear binary optimization problem with linear constraints. Two new approaches, namely a rank predicted pseudo-greedy approach and a variable-depth approach with gradient projection guidance, are proposed to solve these two problems respectively. With these two approaches, the active approach generates the adaptation script much faster than the traditional approach without sacrificing recognition performance. Comparative experiments are designed and conducted for a simple application scenario that involves searching for an item in a long list by voice. The experimental results demonstrate that the proposed active adaptation strategy performs much better than traditional passive adaptation strategies.
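To make the first formulation concrete, the following is a minimal sketch of a plain cost-benefit greedy heuristic for maximum coverage under a knapsack (budget) constraint. It is not the rank predicted pseudo-greedy approach proposed in the thesis, and it omits refinements such as comparing the greedy result against the best single candidate. The function name, the candidate utterances, the confusable units, their weights, and the costs are all hypothetical values chosen for illustration.

```python
from typing import Dict, List, Set, Tuple


def greedy_budgeted_coverage(
    candidates: Dict[str, Tuple[Set[str], float]],
    weights: Dict[str, float],
    budget: float,
) -> List[str]:
    """Select candidates by newly covered weight per unit cost.

    candidates maps a name to (units it covers, positive cost).
    weights maps each unit to an importance score (hypothetical here,
    e.g. a confusability-derived score).
    """
    chosen: List[str] = []
    covered: Set[str] = set()
    remaining = budget
    while True:
        best_name, best_ratio = None, 0.0
        for name, (units, cost) in candidates.items():
            # Skip candidates already chosen or no longer affordable.
            if name in chosen or cost > remaining:
                continue
            # Only units not yet covered contribute to the gain.
            gain = sum(weights.get(u, 0.0) for u in units - covered)
            ratio = gain / cost
            if ratio > best_ratio:
                best_name, best_ratio = name, ratio
        if best_name is None:
            break  # nothing affordable adds coverage
        units, cost = candidates[best_name]
        chosen.append(best_name)
        covered |= units
        remaining -= cost
    return chosen


# Hypothetical adaptation sentences, the units each covers, and its cost.
candidates = {
    "utt_a": ({"b", "p"}, 2.0),
    "utt_b": ({"p", "t"}, 1.0),
    "utt_c": ({"m"}, 1.0),
}
weights = {"b": 3.0, "p": 2.0, "t": 1.0, "m": 0.5}
script = greedy_budgeted_coverage(candidates, weights, budget=3.0)
print(script)
```

Here the budget stands in for the amount of adaptation data the user is asked to provide, and the weights stand in for how much each unit matters to the task; how the thesis actually defines these quantities is not stated in the abstract.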

Similar Papers

A study of an active approach to speaker and task adaptation based on automatic analysis of vocabulary confusability
Wei Li
25 Apr 2012

Analysis on MAP and MLLR based speaker adaptation techniques in speech recognition
T Ramya ... S Lilly Christina
01 Mar 2014

Confidence Score Based Speaker Adaptation of Conformer Speech Recognition Systems
Jiajun Deng ... Xunying Liu
IEEE/ACM Transactions on Audio, Speech, and Language Processing | VOL. 31
01 Jan 2023

Hierarchical Bayesian combination of plug-in maximum a posteriori decoders in deep neural networks-based speech recognition and speaker adaptation
Zhen Huang ... Chin-Hui Lee
Pattern Recognition Letters | VOL. 98
04 Aug 2017
