Neural Cascade Architecture with Triple-domain Loss for Speech Enhancement.

Heming Wang, Deliang Wang

doi:10.1109/taslp.2021.3138716


Open Access

https://doi.org/10.1109/taslp.2021.3138716


Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Publication Date: Jan 1, 2022
Citations: 14

Affiliation: The Ohio State University

#Speech Enhancement #Complex Spectrogram #Cascade Architecture #Time-domain Signal #Objective Speech Intelligibility #Neural Architecture #Domains Of Representation #Prediction Of Target #Strong Baselines #Speech Intelligibility


Abstract


This paper proposes a neural cascade architecture to address the monaural speech enhancement problem. The cascade architecture is composed of three modules that successively optimize the enhanced speech with respect to the magnitude spectrogram, the time-domain signal, and the complex spectrogram. Each module takes as input the noisy speech and the output of the previous module, and generates a prediction of its respective target. The model is trained end to end with a triple-domain loss function that accounts for all three domains of signal representation. Experimental results on the WSJ0 SI-84 corpus show that the proposed model outperforms other strong speech enhancement baselines in terms of objective speech quality and intelligibility.
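The abstract does not give the exact form of the loss or of the modules, so the following is only a minimal NumPy sketch of the general idea, not the authors' implementation. It assumes mean absolute error in each domain, equal weights, a plain Hann-windowed STFT, and that the first module receives the noisy speech in place of a previous output. The names `stft`, `triple_domain_loss`, and `run_cascade` are hypothetical.

```python
import numpy as np


def stft(signal, frame_len=512, hop=256):
    """Naive STFT: Hann-windowed frames followed by a one-sided FFT.

    Returns a complex array of shape (n_frames, frame_len // 2 + 1).
    Requires len(signal) >= frame_len.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop:i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=-1)


def triple_domain_loss(estimate, target, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of errors in three signal domains (assumed form).

    The L1 distances and the equal default weights are assumptions made
    for illustration; the abstract does not specify them.
    """
    est_spec = stft(estimate)
    tgt_spec = stft(target)

    # Magnitude-spectrogram domain.
    mag_loss = np.mean(np.abs(np.abs(est_spec) - np.abs(tgt_spec)))
    # Time-domain (waveform) signal.
    time_loss = np.mean(np.abs(estimate - target))
    # Complex-spectrogram domain: real and imaginary parts compared separately.
    complex_loss = np.mean(
        np.abs(est_spec.real - tgt_spec.real)
        + np.abs(est_spec.imag - tgt_spec.imag)
    )

    w_mag, w_time, w_cplx = weights
    return w_mag * mag_loss + w_time * time_loss + w_cplx * complex_loss


def run_cascade(noisy, modules):
    """Chain modules so each sees the noisy input and the previous output.

    `modules` is a list of callables module(noisy, previous) -> waveform.
    Feeding the noisy signal as the first module's "previous" input is an
    assumption of this sketch.
    """
    outputs = []
    previous = noisy
    for module in modules:
        previous = module(noisy, previous)
        outputs.append(previous)
    return outputs
```

In a real system each entry of `modules` would be a trainable network and the loss would be computed in a framework with automatic differentiation; the sketch only shows how the three domains can be combined into a single scalar objective and how the module inputs are wired.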
