A lightweight speech recognition method with target-swap knowledge distillation for Mandarin air traffic control communications

Jin Ren, Yihua Shi, Shunzhi Yang, Jinfeng Yang

doi:10.7717/peerj-cs.1650

Abstract

Miscommunication between air traffic controllers (ATCOs) and pilots in air traffic control (ATC) can lead to catastrophic aviation accidents. Thanks to advances in speech and language processing, automatic speech recognition (ASR) is an appealing approach to preventing misunderstandings. To give ATCOs and pilots enough time to respond promptly and effectively, ASR systems for ATC must combine high recognition performance with low transcription latency. However, most existing ASR work for ATC focuses on recognition performance and pays little attention to recognition speed, which motivates the research in this article. To address this gap, this article introduces knowledge distillation into ASR for Mandarin ATC communications to improve the generalization performance of a lightweight model. Specifically, we propose a simple yet effective lightweight strategy, named Target-Swap Knowledge Distillation (TSKD), which swaps the logit outputs of the teacher and student models for the target class. This swap can mitigate the teacher model's potential overconfidence in the target class and lets the student model concentrate on distilling knowledge from the non-target classes. Extensive experiments demonstrate the effectiveness of the proposed TSKD in both homogeneous and heterogeneous architectures. The experimental results show that the resulting lightweight ASR model achieves a balance between recognition accuracy and transcription latency.
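
The target-swap step described in the abstract can be illustrated with a minimal sketch for a single output step. This is one possible reading of the abstract, not the authors' implementation: the function names (`swap_target_logits`, `tskd_loss`), the temperature-scaled softmax, the KL-divergence loss, and the `temperature ** 2` scaling are assumptions borrowed from standard knowledge distillation, and the example logits are made up for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def swap_target_logits(teacher_logits, student_logits, target):
    """Exchange the teacher's and student's logits at the target class only.

    Hypothetical helper: the abstract says the target-class logits are
    swapped; all non-target logits are left untouched here.
    """
    teacher_swapped = list(teacher_logits)
    student_swapped = list(student_logits)
    teacher_swapped[target] = student_logits[target]
    student_swapped[target] = teacher_logits[target]
    return teacher_swapped, student_swapped


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions with q > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def tskd_loss(teacher_logits, student_logits, target, temperature=4.0):
    """Distillation loss computed on the swapped logits.

    Assumption: a standard KD loss (KL between softened distributions,
    scaled by temperature squared) applied after the swap.
    """
    t_sw, s_sw = swap_target_logits(teacher_logits, student_logits, target)
    p_teacher = softmax(t_sw, temperature)
    p_student = softmax(s_sw, temperature)
    return temperature ** 2 * kl_divergence(p_teacher, p_student)


# Illustrative values only: class 0 is the target, and the teacher is
# far more confident about it than the student.
teacher = [8.0, 1.0, 0.5]
student = [2.0, 1.5, 0.2]

teacher_sw, student_sw = swap_target_logits(teacher, student, target=0)
before = softmax(teacher)[0]
after = softmax(teacher_sw)[0]
loss = tskd_loss(teacher, student, target=0)
print(f"teacher target probability before swap: {before:.4f}")
print(f"teacher target probability after swap:  {after:.4f}")
print(f"sketch TSKD loss: {loss:.4f}")
```

In this sketch, the teacher's target-class probability drops after the swap because it inherits the student's smaller target logit, which is one way to picture the reduced overconfidence the abstract mentions. A real ASR model would apply such a loss per output token or frame and would typically combine it with the usual recognition loss; those details are not given in the abstract.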

Journal: PeerJ Computer Science
Publication Date: Nov 1, 2023
License type: CC BY-NC 4.0
