Abstract

End-to-end (E2E) Automatic Speech Recognition (ASR) systems are widely deployed across devices and communication domains. However, state-of-the-art ASR systems are known to underperform when there is a mismatch between the training and test domains. As a result, acoustic models deployed in production are often adapted to the target domain to improve accuracy. This paper proposes a method for unsupervised model adaptation of E2E ASR using first-pass transcriptions of the adaptation data produced by the baseline ASR model itself. The paper also proposes two transcription confidence measures that can be used to select an optimal in-domain adaptation set. Experiments were performed with the QuartzNet ASR architecture on the HarperValleyBank corpus. Results show that the unsupervised adaptation technique with confidence-measure-based data selection yields an 8% absolute reduction in word error rate on the HarperValleyBank test set. The proposed method can be applied to any E2E ASR system and is suitable for model adaptation on call-center audio with little to no manual transcription.
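
As a rough illustration of the confidence-based data selection described above (not the paper's exact formulation; the confidence measure, threshold, and all names below are assumptions), one common approach scores each first-pass hypothesis by its length-normalised log-probability and keeps only high-confidence utterances as pseudo-labelled adaptation data:

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hypothesis:
    """A first-pass transcription produced by the baseline ASR model."""
    utterance_id: str
    text: str
    token_log_probs: List[float]  # log-probability of each decoded token


def average_log_prob_confidence(hyp: Hypothesis) -> float:
    """Length-normalised log-probability: a simple utterance-level confidence."""
    if not hyp.token_log_probs:
        return float("-inf")
    return sum(hyp.token_log_probs) / len(hyp.token_log_probs)


def select_adaptation_set(hyps: List[Hypothesis], threshold: float = -0.5) -> List[Hypothesis]:
    """Keep only utterances whose confidence exceeds the threshold.

    The selected (audio, first-pass transcript) pairs would then be used to
    fine-tune the baseline model on the target domain.
    """
    return [h for h in hyps if average_log_prob_confidence(h) >= threshold]


# Hypothetical example: two first-pass hypotheses with per-token log-probabilities.
hyps = [
    Hypothesis("call_001", "i would like to open an account",
               [-0.1, -0.2, -0.1, -0.3, -0.2, -0.1, -0.2]),
    Hypothesis("call_002", "uh the uh balance um",
               [-1.2, -0.9, -1.5, -0.8, -1.1]),
]
selected = select_adaptation_set(hyps, threshold=-0.5)
print([h.utterance_id for h in selected])  # only the confident utterance is kept
```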
