CST: Complex Sparse Transformer for Low-SNR Speech Enhancement.

Kaijun Tan, Wenyu Mao, Xiaozhou Guo, Huaxiang Lu, Chi Zhang, Zhanzhong Cao, Xingang Wang

doi:10.3390/s23052376

Abstract

Speech enhancement for low-SNR audio is challenging. Existing speech enhancement methods are designed mainly for high-SNR audio and usually use RNNs to model audio sequence features, which leaves them unable to learn long-distance dependencies and limits their performance on low-SNR speech enhancement tasks. To overcome this problem, we design a complex transformer module with sparse attention. Unlike the traditional transformer, this model is extended to model complex-domain sequences effectively. It uses a sparse attention mask to balance the model's attention between long-distance and nearby relations, introduces a pre-layer positional embedding module to enhance the model's perception of position information, and adds a channel attention module that lets the model dynamically adjust the weight distribution between channels according to the input audio. Experimental results show that, in low-SNR speech enhancement tests, our models achieve noticeable improvements in both speech quality and intelligibility.
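
The abstract does not spell out the sparsity pattern, so the following is only an illustrative sketch of how a sparse attention mask can mix nearby and long-distance relations. The pattern (a local band plus strided links) and the names `sparse_attention_mask`, `masked_softmax`, `window`, and `stride` are assumptions made for this example, not the authors' design.

```python
import math


def sparse_attention_mask(seq_len, window=2, stride=4):
    """Return a seq_len x seq_len boolean mask; True means query i may attend to key j.

    Assumed pattern (not taken from the paper): a local band of half-width
    `window` plus every `stride`-th key as a long-distance link.
    """
    mask = []
    for i in range(seq_len):
        row = []
        for j in range(seq_len):
            nearby = abs(i - j) <= window      # local neighbourhood
            distant = (i - j) % stride == 0    # strided long-range links
            row.append(nearby or distant)
        mask.append(row)
    return mask


def masked_softmax(scores, allowed):
    """Softmax over one row of attention scores, ignoring disallowed keys."""
    kept = [s for s, a in zip(scores, allowed) if a]
    peak = max(kept)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) if a else 0.0 for s, a in zip(scores, allowed)]
    total = sum(exps)
    return [e / total for e in exps]


mask = sparse_attention_mask(seq_len=8, window=1, stride=4)
weights = masked_softmax([0.0] * 8, mask[0])
```

With these settings, the first query attends only to keys 0, 1, and 4, so uniform scores yield a weight of one third on each of them and zero elsewhere. Widening `window` shifts the balance toward nearby frames, while a smaller `stride` adds more long-distance links.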

Journal: Sensors (Basel, Switzerland)
Publication Date: Feb 21, 2023
Citations: 3
License type: CC BY 4.0

Similar Papers

DNN-based Speech Enhancement for Improving Speech Quality and Intelligibility Simultaneously
Ge Zhan ... Wenjing Wei
01 Dec 2019

Enhancement of speech in noise using multi-channel, time-varying gains derived from the temporal envelope
Rahim Soleymanpour ... Insoo Kim
Applied Acoustics | VOL. 190
24 Jan 2022

Formant Frequency-based Speech Enhancement Technique to improve Intelligibility for hearing aid users with smartphone as an assistive device.
Gautam S Bhat ... Nikhil Shankar
... Health innovations and point-of-care technologies conference. Health innovations and point-of-care technologies conference | VOL. 2017
01 Nov 2017

Binary mask based method for enhancement of mixed noise speech of low SNR input
Sachin Singh ... Manoj Tripathy
International Journal of Speech Technology | VOL. 18
14 Sep 2015
