Performance analysis of a dilated attention fast GAN for speech enhancement

Vahid Ashkani, Vijay Parsa

doi:10.1121/10.0027747

Abstract

Recent advances in speech enhancement have seen the emergence of generator-based methods. However, several of these approaches have difficulty handling input variations: they either excel at low signal-to-noise ratios (SNRs) by relying on intricate representations of noisy and clean speech, or perform well only at higher SNRs. In this work, we investigated speech enhancement using a Dilated Attention Fast Generative Adversarial Network (DAF-GAN). The proposed DAF-GAN framework achieves stable performance across different SNR conditions by efficiently processing long-duration signals. The DAF-GAN features a dilated discriminator model that operates on patches. The generator architecture incorporates multi-decoding and attention gates applied through skip connections, integrated within the Fast-U-Net model to optimize processing speed. An ideal ratio mask was used in the test phase to further refine the enhanced signal by emphasizing target speech while suppressing residual noise or artifacts. The DAF-GAN performance was assessed using objective metrics such as PESQ on a number of noisy speech databases. Results revealed that the DAF-GAN performed modestly in comparison with state-of-the-art models. For example, analyses of the VoiceBank-DEMAND dataset yielded a PESQ score of 2.50 for the DAF-GAN.
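
The ideal ratio mask mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulation used in the paper or how the speech and noise magnitudes are obtained at test time, so everything here is an assumption: the common textbook definition (S^2 / (S^2 + N^2))^beta with beta = 0.5, oracle per-bin magnitudes, and the hypothetical helper names `ideal_ratio_mask` and `apply_mask`.

```python
def ideal_ratio_mask(speech_mag, noise_mag, beta=0.5, eps=1e-12):
    """Per-bin mask (S^2 / (S^2 + N^2)) ** beta, with values in [0, 1].

    Assumed textbook form; the paper's own definition may differ.
    """
    mask = []
    for s, n in zip(speech_mag, noise_mag):
        s_pow, n_pow = s * s, n * n
        # eps avoids division by zero when both magnitudes are zero
        mask.append((s_pow / (s_pow + n_pow + eps)) ** beta)
    return mask


def apply_mask(mask, mixture_mag):
    """Scale each mixture magnitude bin by its mask value."""
    return [m * x for m, x in zip(mask, mixture_mag)]


# Toy magnitudes for three frequency bins (illustrative values only)
speech = [1.0, 0.5, 0.0]   # speech-dominated, balanced, speech-free
noise = [0.0, 0.5, 1.0]
mask = ideal_ratio_mask(speech, noise)
refined = apply_mask(mask, [1.0, 1.0, 1.0])
```

In this toy case the mask is close to 1 where speech dominates, close to 0 where only noise is present, and in between where the two are comparable, which is the sense in which such a mask emphasizes target speech and suppresses residual noise.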

More From: The Journal of the Acoustical Society of America
