Truth-to-Estimate Ratio Mask: A Post-Processing Method for Speech Enhancement Direct at Low Signal-to-Noise Ratios

Bohan Chen, He Wang, Yue Wei, Richard H.Y. So

doi:10.1109/icassp40776.2020.9052919

Abstract

This study proposes a bi-directional recurrent neural network (Bi-RNN) post-processing method for speech enhancement (SE) at low signal-to-noise ratios (SNRs). Current speech enhancement solutions perform poorly in low-SNR conditions. Loizou and Kim proposed a solution that reduces speech distortion errors in the time-frequency (T-F) domain, but it requires knowledge of the ground truth. Because the ground truth is unknown in real-life applications, the current study proposes using a Bi-RNN to implement Loizou and Kim’s solution as a post-processing method for SE engines. The proposed method does not require prior knowledge of the ground truth. The effectiveness of the proposed method is investigated with a spectral subtraction (SS) SE engine, a non-negative matrix factorization (NMF) SE engine, and a deep neural network ideal ratio mask (DNN-IRM) SE engine, under matched and mismatched noise and different SNR conditions. Experimental results demonstrate that the proposed post-processing method improves both perceptual evaluation of speech quality (PESQ) and short-time objective intelligibility (STOI) scores for all of these SE engines, especially at low SNRs.
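The abstract does not define the truth-to-estimate ratio mask, so the following is only a minimal sketch under a stated assumption: that the mask is the per-T-F-unit ratio of the clean (ground-truth) magnitude to the magnitude estimated by an SE engine. Such an oracle ratio can be computed only when clean speech is available, for example when preparing training targets; how the Bi-RNN is trained or applied is not shown. The function names, the `eps` guard, and the `max_ratio` clipping value are hypothetical choices for illustration, not details taken from the paper.

```python
from typing import List

# Magnitude spectrogram indexed as [frame][frequency_bin] (plain lists, no dependencies).
Spectrogram = List[List[float]]


def truth_to_estimate_ratio(truth_mag: Spectrogram, est_mag: Spectrogram,
                            eps: float = 1e-8, max_ratio: float = 10.0) -> Spectrogram:
    """Assumed oracle mask: |S| / |S_hat| per T-F unit, clipped at max_ratio.

    eps avoids division by zero; max_ratio is a hypothetical cap on the correction.
    """
    return [
        [min(t / (e + eps), max_ratio) for t, e in zip(t_row, e_row)]
        for t_row, e_row in zip(truth_mag, est_mag)
    ]


def apply_mask(est_mag: Spectrogram, mask: Spectrogram) -> Spectrogram:
    """Scale each estimated T-F magnitude by the corresponding mask value."""
    return [
        [e * m for e, m in zip(e_row, m_row)]
        for e_row, m_row in zip(est_mag, mask)
    ]


# Toy 2x2 example: the engine over-estimates some units and under-estimates others.
truth = [[1.0, 0.5], [0.0, 2.0]]
estimate = [[2.0, 0.5], [1.0, 1.0]]
mask = truth_to_estimate_ratio(truth, estimate)
corrected = apply_mask(estimate, mask)
```

With the oracle mask, the corrected magnitudes match the ground truth up to the `eps` guard. In a post-processing setting, a learned model would have to predict the mask from the engine output alone, since the ground truth is unavailable at inference time.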
