Mutual-learning sequence-level knowledge distillation for automatic speech recognition

Zerui Li, Yue Ming, Lei Yang, Jing-Hao Xue

doi:10.1016/j.neucom.2020.11.025

Abstract

Automatic speech recognition (ASR) is a crucial technology for man-machine interaction. End-to-end deep learning models for ASR have been widely studied in recent years. However, their large model sizes and computation costs make them unsuitable for practical ASR applications. To address this issue, we propose a novel mutual-learning sequence-level knowledge distillation framework for ASR in which the students have distinct structures. Trained mutually and simultaneously, each student learns not only from the pre-trained teacher but also from its distinct peers. This improves the generalization capability of the whole network by making up for the insufficiency of each student and bridging the gap between each student and the teacher. Extensive experiments on the TIMIT and large LibriSpeech corpora show that, compared with state-of-the-art methods, the proposed method achieves an excellent balance between recognition accuracy and model compression.
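
The abstract does not state the training objective, so the following is only a minimal sketch of how a per-student loss in such a framework might be assembled. It assumes that sequence-level distillation means training on the token sequence decoded by the teacher, and that peer learning is a per-step KL divergence between the peers' and the student's output distributions. The function names, the `peer_weight` parameter, and the token-level form of the peer term are illustrative assumptions, not details taken from the paper.

```python
import math

def softmax(logits):
    """Numerically stable softmax over one vocabulary-sized list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, target):
    """Negative log-likelihood of the target token id under probs."""
    return -math.log(probs[target])

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def student_loss(student_logits, peer_logits_list, teacher_tokens, peer_weight=0.5):
    """Hypothetical loss for one student on one utterance.

    student_logits:   per-step logits of this student, shape [steps][vocab]
    peer_logits_list: list of per-step logits, one entry per peer student
    teacher_tokens:   token ids decoded by the pre-trained teacher
                      (assumed sequence-level distillation target)
    peer_weight:      assumed weight of the peer term (not from the paper)
    """
    steps = len(teacher_tokens)
    total = 0.0
    for t in range(steps):
        p_student = softmax(student_logits[t])
        # Teacher term: fit the teacher's decoded sequence.
        total += cross_entropy(p_student, teacher_tokens[t])
        # Peer term: pull the student toward each peer's distribution,
        # averaged over the peers.
        for peer_logits in peer_logits_list:
            p_peer = softmax(peer_logits[t])
            total += peer_weight * kl_divergence(p_peer, p_student) / len(peer_logits_list)
    return total / steps
```

In this sketch each student would compute the same loss with the remaining students as its peers, so all students are updated in the same training step. A peer that agrees exactly with the student contributes nothing, and the loss then reduces to the teacher term alone.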

Journal: Neurocomputing
Publication Date: Dec 11, 2020
Citations: 15
License type: other-oa
