Speech Enhancement Performance Research Articles

Enhancing reverberant speech with deep neural networks (DNNs) is an interesting yet challenging topic. Speech enhancement performance degrades significantly when test and training conditions are mismatched. In this paper we propose Static Reverberation Aware Training (SRAT)-based dereverberation, in which the reverberation estimate is obtained by averaging over frames. This method significantly reduces the input dimensionality and enables the DNN to learn the relation between clean and reverberant speech more efficiently. Most speech enhancement approaches ignore phase information because of its complicated structure. Because phase correlates closely with the speech signal, we exploit this relationship to achieve better performance with the DNN: phase information is combined with magnitude information and used as the DNN input. We denote this method the phase-aware DNN. Finally, both the phase information and the reverberation estimate are appended to the reverberant speech features to achieve better enhancement in a distant-talking condition. Features of the reverberant speech, the phase, and the reverberation estimate are used during both training and testing, so that the DNN can use reverberation and phase information to generalize better. The proposed method was evaluated on the REVERB CHALLENGE 2014 database. Results improved significantly with respect to both reconstructed speech quality (PESQ: Perceptual Evaluation of Speech Quality) and the influence of reverberation (SRMR: Speech-to-Reverberation Modulation Energy Ratio). Compared with the conventional DNN-based approach, the proposed method improved SRMR from 4.84 to 5.92 and PESQ from 2.34 to 2.70, indicating that it can efficiently enhance speech severely corrupted by reverberation.
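The input construction described in the abstract above (magnitude, phase, and a static reverberation estimate) might be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not give the exact estimator, so the static estimate here is simply assumed to be the per-utterance mean of the log-magnitude spectrum, and the names `stft` and `build_features`, the frame length, and the hop size are all hypothetical.

```python
import numpy as np

def stft(signal, frame_len=512, hop=256):
    """Short-time Fourier transform with a Hann window; returns (frames, bins)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop: i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def build_features(reverberant, frame_len=512, hop=256):
    """Concatenate log-magnitude, phase, and a static per-utterance vector."""
    spec = stft(reverberant, frame_len, hop)
    log_mag = np.log(np.abs(spec) + 1e-8)   # magnitude features
    phase = np.angle(spec)                  # phase features in [-pi, pi]
    # ASSUMPTION: the static reverberation estimate is taken to be the mean
    # log-magnitude over all frames; the source does not specify the estimator.
    static_est = log_mag.mean(axis=0, keepdims=True)
    static_tiled = np.repeat(static_est, log_mag.shape[0], axis=0)
    # Every frame receives the same static vector alongside its own features.
    return np.concatenate([log_mag, phase, static_tiled], axis=1)
```

With a one-second signal at 16 kHz and the default settings, each row of the result holds 257 magnitude values, 257 phase values, and the 257-dimensional static vector, which is identical across frames.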

Speech enhancement algorithms play an important role in speech signal processing, and many have been studied over the past several decades. A typical speech enhancement algorithm uses a noise removal method and a statistical model filter to analyze the speech signal in the frequency domain; spectral subtraction and Wiener filtering are representative examples. These algorithms enhance speech well in general, but their performance deteriorates for specific noise types and in low signal-to-noise ratio (SNR) environments. In addition, when the noise is estimated incorrectly, residual noise remains in the speech signal, so the spectrum corresponding to speech is distorted or frames corresponding to speech cannot be recovered, and speech recognition performance deteriorates. This deterioration arises from the mismatch between the recognition environment and the training model. We use a silence-feature normalization model to improve the recognition rate under such mismatched noisy environments. Conventional silence-feature normalization has the problem that the energy of the silent part increases, which blurs the boundary between speech and silence and degrades recognition performance. In this study, we use the cepstrum features of the noise signal in the silence-feature normalization model, setting a reference value for voiced/unvoiced classification, to improve the performance of silence-feature normalization for signals with a low SNR. Recognition experiments confirm that the recognition rate improves compared with other methods.
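As a rough illustration of the spectral subtraction baseline mentioned in the abstract above, the sketch below subtracts a noise magnitude estimate from each frame and applies a spectral floor. It is a generic textbook form rather than the method of that paper; the function name `spectral_subtraction`, the choice to estimate noise from the first few frames, and the floor value are assumptions made for the example.

```python
import numpy as np

def spectral_subtraction(noisy_mag, noise_frames=5, floor=0.01):
    """Magnitude spectral subtraction on a (frames, bins) magnitude array."""
    # ASSUMPTION: the leading frames contain noise only, so their mean
    # serves as the noise magnitude estimate.
    noise_mag = noisy_mag[:noise_frames].mean(axis=0)
    cleaned = noisy_mag - noise_mag
    # A spectral floor keeps magnitudes positive where the noise estimate
    # exceeds the observed magnitude.
    return np.maximum(cleaned, floor * noisy_mag)
```

If the noise estimate is wrong, the subtraction either leaves residual noise or removes speech energy, which is the failure mode the abstract describes for low-SNR conditions.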

Articles published on Speech Enhancement Performance

Time-Domain Multi-Modal Bone/Air Conducted Speech Enhancement

Audio-Visual Speech Enhancement Using Conditional Variational Auto-Encoders

Embedding Encoder-Decoder With Attention Mechanism for Monaural Speech Enhancement

Speech enhancement based on noise classification and deep neural network

Deep learning for minimum mean-square error approaches to speech enhancement

Disentangled Feature Learning for Noise-Invariant Speech Enhancement

A Joint Learning Algorithm for Complex-Valued T-F Masks in Deep Learning-Based Single-Channel Speech Enhancement Systems

A Unified Convolutional Beamformer for Simultaneous Denoising and Dereverberation

An Iterative Kalman Filter with Reduced-Biased Kalman Gain for Single Channel Speech Enhancement in Non-stationary Noise Condition

A novel fast nonstationary noise tracking approach based on MMSE spectral power estimator

Improved Wasserstein conditional generative adversarial network speech enhancement

On Speech Enhancement Under PSD Uncertainty

Speech Enhancement Algorithm Based on Sound Source Localization and Scene Matching for Binaural Digital Hearing Aids

Single-channel speech enhancement using inter-component phase relations

Phase and reverberation aware DNN for distant-talking speech enhancement

Training and compensation of class-conditioned NMF bases for speech enhancement

Pipelined Architecture of Multi-Band Spectral Subtraction Algorithm for Speech Enhancement

Performance Evaluation of Silence-Feature Normalization Model using Cepstrum Features of Noise Signals

Speech and noise power estimation using Gamma modeling

Pitch pattern matching based speech enhancement
