Speech Enhancement Method Research Articles

Supervised models for speech enhancement are trained using artificially generated mixtures of clean speech and noise signals. However, the synthetic training conditions may not accurately reflect real-world conditions encountered during testing. This discrepancy can result in poor performance when the test domain significantly differs from the synthetic training domain. To tackle this issue, the UDASE task of the 7th CHiME challenge aimed to leverage real-world noisy speech recordings from the test domain for unsupervised domain adaptation of speech enhancement models. Specifically, this test domain corresponds to the CHiME-5 dataset, characterized by real multi-speaker and conversational speech recordings made in noisy and reverberant domestic environments, for which ground-truth clean speech signals are not available. In this paper, we present the objective and subjective evaluations of the systems that were submitted to the CHiME-7 UDASE task, and we provide an analysis of the results. This analysis reveals a limited correlation between subjective ratings and several supervised nonintrusive performance metrics recently proposed for speech enhancement. Conversely, the results suggest that more traditional intrusive objective metrics can be used for in-domain performance evaluation using the reverberant LibriCHiME-5 dataset developed for the challenge. The subjective evaluation indicates that all systems successfully reduced the background noise, but always at the expense of increased distortion. Out of the four speech enhancement methods evaluated subjectively, only one demonstrated an improvement in overall quality compared to the unprocessed noisy speech, highlighting the difficulty of the task. The tools and audio material created for the CHiME-7 UDASE task are shared with the community.

Read full abstract

Speech communicated is adversely affected by environmental noise. It is important to process the speech and reduce noise for better understanding. Speech enhancement or noise reduction is useful to provide comfort for human or machine listening. Traditional algorithms provide better noise reduction and better-quality speech. Due to the non-stationary nature of noise and the quasi-stationary nature of speech, the traditional methods are proven inadequate in achieving high-quality speech. Later statistical estimators based on Gaussian, and super-Gaussian Probability Density Function (PDF) assumption further improved the enhancement performance. But still, non-stationary noise nature introduces artifacts in processed signal and results in decreased performance. It is observed that neural network approaches and the factorization approach provide better performance even under non-stationary noises by proper training and large database. Different features result in variations in output performance under unseen noise and speaker conditions. It is important to understand the importance and advantages of traditional methods, statistical estimators, and neural network approaches performances. To select the suitable method for a required application, it is essential to consider the trade-off between quality and distortion. In this work, the importance of speech enhancement methods is discussed. Performance measures used for understanding the speech enhancement like Signal to Noise Ratio (SNR), Segmental SNR, Log-Likelihood Ratio (LLR), Weighted Spectral Slope (WSS), Perceptual Evaluation of Speech Quality (PESQ), Short-Time Objective Intelligibility (STOI), Signal to Distortion Ratio (SDR) and Mean Opinion Score (MOS), are given. Highlights of important results are discussed for analyzing better speech enhancement methods for the required application. In this work, performance is compared using objective and subjective performance measures. Simulation results show superior performance when neural network is employed in statistical estimators.

Read full abstract

Speech Enhancement Method Research Articles

Related Topics

Articles published on Speech Enhancement Method

From simulation to reality: tackling data mismatches in speech enhancement with unsupervised pre-training

Deep-learning-based speech enhancement under rough-focus conditions with optical laser microphone

Deep autoencoder with gated convolutional neural networks for improving speech quality in secured communications

A dual-region speech enhancement method based on voiceprint segmentation

A neural network approach for speech enhancement and noise-robust bandwidth extension

Assessment of Self-Supervised Denoising Methods for Esophageal Speech Enhancement

Air Traffic Control Speech Enhancement Method Based on Improved DNN-IRM

Objective and subjective evaluation of speech enhancement methods in the UDASE task of the 7th CHiME challenge

Using deep learning to improve the intelligibility of a target speaker in noisy multi-talker environments for people with normal hearing and hearing loss.

Weak abnormal acoustic signal enhancement and recognition using squeeze-and-excitation attention based denoising convolutional neural network during high-dam flood discharging

PDenoiser: A Personalized Speech Enhancement Neural Network for Pre-hospital Emergency Medical Services.

Supervised Attention Multi-Scale Temporal Convolutional Network for monaural speech enhancement

INCREASING ROBUSTNESS OF I-VECTORS VIA MASKING: A CASE STUDY IN SYNTHETIC SPEECH DETECTION

Stacked Multiscale Densely Connected Temporal Convolutional Attention Network for Multi-Objective Speech Enhancement in an Airborne Environment

Analysis of statistical estimators and neural network approaches for speech enhancement

Towards an Environmentally Robust Speech Assistant System for Emergency Medical Services.

Decoupling-style monaural speech enhancement with a triple-branch cross-domain fusion network

A Two-Stage Deep Representation Learning-Based Speech Enhancement Method Using Variational Autoencoder and Adversarial Training

A speech enhancement method combining beamforming with RNN for hearing aids

Robust Spoofed Speech Detection with Denoised I-vectors

Lead the way for us

Editage

Paperpal

R Discovery

Mind the Graph

Speech Enhancement Method Research Articles

Related Topics

Articles published on Speech Enhancement Method

From simulation to reality: tackling data mismatches in speech enhancement with unsupervised pre-training

Deep-learning-based speech enhancement under rough-focus conditions with optical laser microphone

Deep autoencoder with gated convolutional neural networks for improving speech quality in secured communications

A dual-region speech enhancement method based on voiceprint segmentation

A neural network approach for speech enhancement and noise-robust bandwidth extension

Assessment of Self-Supervised Denoising Methods for Esophageal Speech Enhancement

Air Traffic Control Speech Enhancement Method Based on Improved DNN-IRM

Objective and subjective evaluation of speech enhancement methods in the UDASE task of the 7th CHiME challenge

Using deep learning to improve the intelligibility of a target speaker in noisy multi-talker environments for people with normal hearing and hearing loss.

Weak abnormal acoustic signal enhancement and recognition using squeeze-and-excitation attention based denoising convolutional neural network during high-dam flood discharging

PDenoiser: A Personalized Speech Enhancement Neural Network for Pre-hospital Emergency Medical Services.

Supervised Attention Multi-Scale Temporal Convolutional Network for monaural speech enhancement

INCREASING ROBUSTNESS OF I-VECTORS VIA MASKING: A CASE STUDY IN SYNTHETIC SPEECH DETECTION

Stacked Multiscale Densely Connected Temporal Convolutional Attention Network for Multi-Objective Speech Enhancement in an Airborne Environment

Analysis of statistical estimators and neural network approaches for speech enhancement

Towards an Environmentally Robust Speech Assistant System for Emergency Medical Services.

Decoupling-style monaural speech enhancement with a triple-branch cross-domain fusion network

A Two-Stage Deep Representation Learning-Based Speech Enhancement Method Using Variational Autoencoder and Adversarial Training

A speech enhancement method combining beamforming with RNN for hearing aids

Robust Spoofed Speech Detection with Denoised I-vectors