Improving Noise Robust Automatic Speech Recognition with Single-Channel Time-Domain Enhancement Network

Keisuke Kinoshita, Marc Delcroix, Tomohiro Nakatani, Tsubasa Ochiai

doi:10.1109/icassp40776.2020.9053266

Abstract

With the advent of deep learning, research on noise-robust automatic speech recognition (ASR) has progressed rapidly. However, the performance of single-channel ASR systems in noisy conditions remains unsatisfactory. Indeed, most single-channel speech enhancement (SE) methods for denoising have brought only limited performance gains over a state-of-the-art ASR back-end trained on multi-condition data. Recently, there has been much research on neural network-based SE methods that operate in the time domain and reach levels of enhancement performance never attained before. However, it has not been established whether the high enhancement performance of such time-domain approaches translates into improved ASR performance. In this paper, we show that a single-channel time-domain denoising approach can significantly improve ASR performance, providing more than 30% relative word error reduction over a strong ASR back-end on the real evaluation data of the single-channel track of the CHiME-4 dataset. These positive results demonstrate that single-channel noise reduction can still improve ASR performance, which should open the door to more research in that direction.
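The headline figure in the abstract is a relative reduction, not an absolute one. As a minimal sketch, assuming "relative word error reduction" means the usual relative change in word error rate (WER) between a baseline system and the same system with enhancement, the arithmetic can be written as follows. The function name and the WER values are hypothetical and chosen only for illustration; they are not results from the paper.

```python
def relative_wer_reduction(baseline_wer: float, enhanced_wer: float) -> float:
    """Return the relative WER reduction in percent.

    Both arguments are word error rates in percent. The result is
    100 * (baseline - enhanced) / baseline.
    """
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return 100.0 * (baseline_wer - enhanced_wer) / baseline_wer


if __name__ == "__main__":
    # Hypothetical WERs for illustration only (NOT taken from the paper).
    baseline = 20.0  # WER of the ASR back-end on noisy input
    enhanced = 13.0  # WER of the same back-end after enhancement
    print(f"{relative_wer_reduction(baseline, enhanced):.1f}%")  # prints "35.0%"
```

With these made-up numbers, an absolute drop of 7 points from a 20% baseline corresponds to a 35% relative reduction, which is the kind of quantity the abstract reports as "more than 30%".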

Similar Papers

A Time Domain Progressive Learning Approach with SNR Constriction for Single-Channel Speech Enhancement and Recognition
Zhaoxu Nian ... Jun Du
23 May 2022

Harmonicity Based Dereverberation for Improving Automatic Speech Recognition Performance and Speech Intelligibility
K Kinoshita
IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences | VOL. E88-A
01 Jul 2005

Combined speech enhancement and auditory modelling for robust distributed speech recognition
Ronan Flynn ... Edward Jones
Speech Communication | VOL. 50
20 May 2008

A Joint Speech Enhancement and Self-Supervised Representation Learning Framework for Noise-Robust Speech Recognition
Qiu-Shi Zhu ... Jie Zhang
IEEE/ACM Transactions on Audio, Speech, and Language Processing | VOL. 31
01 Jan 2023
