Using Separate Losses for Speech and Noise in Mask-Based Speech Enhancement

Ziyi Xu, Samy Elshamy, Tim Fingscheidt

doi:10.1109/icassp40776.2020.9052968

Abstract

Estimating time-frequency domain masks for speech enhancement using deep learning approaches has recently become a popular research field. In this paper, we propose a novel components loss (CL) for the training of neural networks for speech enhancement. During the training process, the proposed CL offers separate control over suppression of the noise component and preservation of the speech component. We illustrate the potential of the proposed CL using the example of a convolutional neural network (CNN) for mask-based speech enhancement. We show improvement in almost all employed instrumental quality metrics over the baseline losses, which comprise the conventional mean squared error (MSE) loss and the perceptual evaluation of speech quality (PESQ) loss. On average, an SNR improvement more than 0.3 dB higher and a PESQ score on the speech component at least 0.1 points higher are obtained. In addition, the residual noise sounds more natural, and the PESQ score on the enhanced speech is consistently the best. All improvements are more pronounced at low SNR conditions.
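To make the idea of separate control concrete, here is a minimal sketch of a two-term loss in plain Python. It is not the loss defined in the paper: the abstract does not state the formula, so the function name `components_loss_sketch`, the weight `alpha`, and the choice of mean squared magnitudes for both terms are assumptions made for illustration only. The sketch assumes that, during training, the clean speech and noise magnitude spectra are available separately, so the estimated mask can be applied to each of them on its own.

```python
from typing import List

# Magnitude spectrogram as nested lists: frames x frequency bins.
Spectrogram = List[List[float]]


def apply_mask(mask: Spectrogram, spec: Spectrogram) -> Spectrogram:
    """Multiply a time-frequency mask element-wise onto a magnitude spectrogram."""
    return [[m * x for m, x in zip(m_row, x_row)]
            for m_row, x_row in zip(mask, spec)]


def mean_square(spec: Spectrogram) -> float:
    """Mean of the squared values over all time-frequency bins."""
    values = [x * x for row in spec for x in row]
    return sum(values) / len(values)


def components_loss_sketch(mask: Spectrogram,
                           speech_mag: Spectrogram,
                           noise_mag: Spectrogram,
                           alpha: float = 0.5) -> float:
    """Hypothetical two-term loss (illustrative, not the paper's definition).

    The first term penalizes distortion of the masked speech component,
    the second penalizes the power of the masked (residual) noise component.
    alpha in [0, 1] is an assumed trade-off weight between the two.
    """
    filtered_speech = apply_mask(mask, speech_mag)
    filtered_noise = apply_mask(mask, noise_mag)

    # Speech preservation: masked speech should stay close to clean speech.
    speech_error = [[f - s for f, s in zip(f_row, s_row)]
                    for f_row, s_row in zip(filtered_speech, speech_mag)]
    speech_distortion = mean_square(speech_error)

    # Noise suppression: masked noise should carry as little power as possible.
    residual_noise = mean_square(filtered_noise)

    return (1.0 - alpha) * speech_distortion + alpha * residual_noise


if __name__ == "__main__":
    speech = [[1.0, 2.0]]
    noise = [[2.0, 0.0]]
    all_pass = [[1.0, 1.0]]   # keeps speech intact, lets all noise through
    all_block = [[0.0, 0.0]]  # removes all noise, destroys the speech
    print(components_loss_sketch(all_pass, speech, noise))
    print(components_loss_sketch(all_block, speech, noise))
```

In this sketch, raising `alpha` puts more weight on noise suppression and lowering it puts more weight on speech preservation, which is one simple way to obtain the kind of separate control the abstract describes.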


Similar Papers


Components loss for neural networks in mask-based speech enhancement
Ziyi Xu ... Tim Fingscheidt
EURASIP Journal on Audio, Speech, and Music Processing | VOL. 2021
02 Jul 2021

Residual Recurrent Neural Network for Speech Enhancement
Jalal Abdulbaqi ... Yue Gu
Proceedings of the ... IEEE International Conference on Acoustics, Speech, and Signal Processing. ICASSP (Conference) | VOL. 2020
01 May 2020

Deep Time Delay Neural Network for Speech Enhancement with Full Data Learning
Cunhang Fan ... Zhengqi Wen
24 Jan 2021

Fusion of Amplitude and Complex Domains based on Deep Neural Networks for Speech Enhancement
Mohammad Saeed Deylami ... Sanaz Seyedin
04 Aug 2020
