First coarse, fine afterward: A lightweight two-stage complex approach for monaural speech enhancement

Feng Dang,Hangting Chen,Qi Hu,Pengyuan Zhang,Yonghong Yan

doi:10.1016/j.specom.2022.11.004


Open Access

https://doi.org/10.1016/j.specom.2022.11.004


Abstract


Deep neural network-based speech enhancement systems have achieved promising results. However, state-of-the-art (SOTA) models usually have too many parameters and require too much computation to be deployed on devices in practical applications. In this paper, we propose a novel lightweight complex spectral mask-based neural network with a two-stage pipeline for monaural speech enhancement. The network decouples the primary problem into several simpler sub-problems, which reduces both the computational burden and the number of model parameters. Specifically, the network contains two mask-based sub-networks, CoarseNet and FineNet, implemented in the complex domain to improve enhancement performance progressively. CoarseNet takes coarse-grained compact features as input and estimates the corresponding full-band complex mask. FineNet focuses on further removing residual noise in the low-frequency components of the CoarseNet output by predicting a fine-grained mask. The transforms between the coarse and fine scales are based on a novel learnable complex-valued rectangular bandwidth (LCRB) filter bank. Furthermore, we propose a lightweight and general complex-valued attention mechanism to improve the modeling capability of the network's convolutional encoder/decoder, and we use cross-stage skip connections (CSSC) between the sub-networks to facilitate information flow between them. Extensive experiments on two standard corpora demonstrate that the proposed approach outperforms previous SOTA systems under various conditions while maintaining a relatively small model size and low computational complexity.
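The coarse-then-fine data flow described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes a fixed, real-valued, non-overlapping rectangular filter bank in place of the learnable complex-valued LCRB filter bank, and it treats the two sub-networks as placeholder callables. All names here (`rectangular_filter_bank`, `two_stage_enhance`, `coarse_mask_fn`, `fine_mask_fn`, `n_bands`, `n_low`) are hypothetical and chosen for illustration only.

```python
import numpy as np


def rectangular_filter_bank(n_freq, n_bands):
    """Return an (n_bands, n_freq) 0/1 matrix of non-overlapping bands.

    Assumption: fixed uniform bands; the paper's LCRB filter bank is
    learnable and complex-valued, which this sketch does not model.
    """
    edges = np.linspace(0, n_freq, n_bands + 1).astype(int)
    fb = np.zeros((n_bands, n_freq))
    for b in range(n_bands):
        fb[b, edges[b]:edges[b + 1]] = 1.0
    return fb


def two_stage_enhance(noisy, coarse_mask_fn, fine_mask_fn, n_bands, n_low):
    """Apply a coarse full-band mask, then a fine low-frequency mask.

    noisy:          complex spectrogram, shape (n_freq, n_frames)
    coarse_mask_fn: placeholder for CoarseNet; maps band features
                    (n_bands, n_frames) to a complex band mask
    fine_mask_fn:   placeholder for FineNet; maps the low-frequency
                    bins (n_low, n_frames) to a complex mask
    """
    n_freq = noisy.shape[0]
    fb = rectangular_filter_bank(n_freq, n_bands)

    # Compress: average the bins inside each band (coarse features).
    analysis = fb / fb.sum(axis=1, keepdims=True)
    coarse_feat = analysis @ noisy

    # Stage 1: band-level complex mask, expanded back to every bin.
    band_mask = coarse_mask_fn(coarse_feat)
    full_mask = fb.T @ band_mask
    coarse_out = noisy * full_mask  # complex multiplication

    # Stage 2: refine only the lowest n_low bins with a per-bin mask.
    enhanced = coarse_out.copy()
    enhanced[:n_low] = coarse_out[:n_low] * fine_mask_fn(coarse_out[:n_low])
    return coarse_out, enhanced
```

In a real system the two callables would be trained networks. For a quick shape check, they can be replaced with constant masks, for example `lambda f: np.ones(f.shape, dtype=complex)`, in which case both outputs equal the input spectrogram.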


Journal: Speech Communication
Publication Date: Nov 25, 2022
Citations: 5

