Abstract

This letter improves feature-based knowledge distillation for robust speech recognition. Distillation techniques have been shown to improve the robustness of speech recognition systems. We propose a noise-separated adaptive feature distillation method, consisting of an adaptive distillation position selection strategy and a noise separation mechanism, under the assumption that the student and teacher share a common network structure. The proposed method offers two improvements. First, distillation positions are selected adaptively in each iteration by comparing loss values computed on the intermediate representations of the student and the teacher, which increases the flexibility of knowledge transfer during distillation. Second, a noise separation module constrains the elimination of noise information by explicitly separating the speech and noise components of noisy speech, which reduces the interference of noise during distillation. As a result, the proposed method achieves better recognition performance than the standard feature-based knowledge distillation method.
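To make the adaptive selection step more concrete, the sketch below shows one way distillation positions could be chosen each iteration by comparing per-position feature losses between the student and the teacher. The function name select_distillation_positions, the use of MSE as the feature loss, and the top-k selection rule are illustrative assumptions of this sketch, not details taken from the letter.

```python
import torch
import torch.nn.functional as F

def select_distillation_positions(student_feats, teacher_feats, top_k=2):
    """Pick distillation positions for the current iteration.

    student_feats / teacher_feats: lists of intermediate-feature tensors,
    one per candidate position, with matching shapes (an assumption here).
    Returns the indices of the selected positions and the distillation loss
    accumulated over them.
    """
    # Per-position mismatch between student and teacher features
    # (MSE is an assumed choice of feature loss).
    losses = torch.stack([
        F.mse_loss(s, t.detach())
        for s, t in zip(student_feats, teacher_feats)
    ])
    # Distill at the positions where the student deviates most from the teacher.
    selected = torch.topk(losses, k=top_k).indices
    distill_loss = losses[selected].sum()
    return selected, distill_loss

# Example with dummy features at three candidate positions (shapes assumed).
s_feats = [torch.randn(4, 80, 256) for _ in range(3)]
t_feats = [torch.randn(4, 80, 256) for _ in range(3)]
positions, loss = select_distillation_positions(s_feats, t_feats, top_k=2)
```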
