Denoising-and-Dereverberation Hierarchical Neural Vocoder for Robust Waveform Generation

Yang Ai, Haoyu Li, Xin Wang, Junichi Yamagishi, Zhenhua Ling

doi:10.1109/slt48900.2021.9383611

Abstract

This paper presents a denoising and dereverberation hierarchical neural vocoder (DNR-HiNet) that converts noisy and reverberant acoustic features into a clean speech waveform. We implement it mainly by modifying the amplitude spectrum predictor (ASP) of the original HiNet vocoder. The modified denoising and dereverberation ASP (DNR-ASP) predicts clean log amplitude spectra (LAS) from degraded input acoustic features. To achieve this, the DNR-ASP first predicts the noisy and reverberant LAS, the noise LAS (which carries the noise information), and the room impulse response (which carries the reverberation information), and then performs initial denoising and dereverberation. The initially processed LAS are then enhanced by another neural network to yield the final clean LAS. To further improve the quality of the generated clean LAS, we also introduce a bandwidth extension model and a frequency resolution extension model into the DNR-ASP. Experimental results indicate that the DNR-HiNet vocoder generates a denoised and dereverberated waveform from noisy and reverberant acoustic features and outperforms the original HiNet vocoder and a few other neural vocoders. We also applied the DNR-HiNet vocoder to speech enhancement tasks, where its performance was competitive with several advanced speech enhancement methods.
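
The abstract does not say which operations the initial denoising and dereverberation step uses. As a rough illustration only, the sketch below assumes spectral subtraction for denoising and division by the magnitude response of the room impulse response for dereverberation, applied to one frame. The function names, the `EPS` floor, and the per-frame treatment of reverberation are assumptions made for this example, not the authors' method.

```python
import cmath
import math

EPS = 1e-8  # assumed floor that keeps logarithms finite


def rir_log_magnitude(rir, n_bins):
    """Log magnitude response of a room impulse response, sampled at
    n_bins frequencies from 0 to Nyquist (naive DFT, fine for a sketch)."""
    n_fft = 2 * (n_bins - 1)
    out = []
    for k in range(n_bins):
        acc = sum(h * cmath.exp(-2j * math.pi * k * n / n_fft)
                  for n, h in enumerate(rir))
        out.append(math.log(max(abs(acc), EPS)))
    return out


def initial_denoise_dereverb(noisy_las, noise_las, rir):
    """Hypothetical initial processing of one frame.

    noisy_las, noise_las: natural-log amplitude spectra of equal length.
    rir: time-domain room impulse response samples.
    """
    rir_las = rir_log_magnitude(rir, len(noisy_las))
    cleaned = []
    for y, d, h in zip(noisy_las, noise_las, rir_las):
        # Assumed denoising: subtract noise amplitude from degraded amplitude.
        amp = max(math.exp(y) - math.exp(d), EPS)
        # Assumed dereverberation: divide by |H|, a subtraction in the log domain.
        cleaned.append(math.log(amp) - h)
    return cleaned


# Toy frame: amplitude 2.0 in every bin, noise amplitude 0.5, no reverberation.
frame = initial_denoise_dereverb([math.log(2.0)] * 5,
                                 [math.log(0.5)] * 5,
                                 [1.0])
```

Treating reverberation as a per-frame multiplicative filter holds only when the impulse response is short relative to the analysis frame. In the paper, a further neural network refines this initial estimate into the final clean LAS.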

Similar Papers

Knowledge-and-Data-Driven Amplitude Spectrum Prediction for Hierarchical Neural Vocoders
Yang Ai ... Zhen-Hua Ling
25 Oct 2020

A Neural Vocoder With Hierarchical Generation of Amplitude and Phase Spectra for Statistical Parametric Speech Synthesis
Yang Ai ... Zhen-Hua Ling
IEEE/ACM Transactions on Audio, Speech, and Language Processing | VOL. 28
01 Jan 2020

Estimating attenuation and the relative information content of amplitude and phase spectra
James Rickett
GEOPHYSICS | VOL. 72
01 Jan 2007

Real-time automatic small infrared target detection using local spectral filtering in the frequency
Hao Chen ... Jiafeng Li
04 Nov 2014
