An Attention-based Neural Network Approach for Single Channel Speech Enhancement

Xiang Hao, Lei Xie, Sining Sun, Yong Xu, Changhao Shan

doi:10.1109/icassp.2019.8683169

Abstract

This paper proposes an attention-based neural network approach for single-channel speech enhancement. Our work is inspired by the recent success of attention models in sequence-to-sequence learning. It is intuitive to use an attention mechanism in speech enhancement, as humans are able to focus on the important speech components in an audio stream with "high attention" while perceiving the unimportant regions (e.g., noise or interference) with "low attention", adjusting the focal point over time. Specifically, taking the noisy spectrum as input, our model is composed of an LSTM-based encoder, an attention mechanism, and a speech generator, and it outputs the enhanced spectrum. Experiments show that, compared with OM-LSA and the LSTM baseline, the proposed attention approach consistently achieves better performance in terms of speech quality (PESQ) and intelligibility (STOI). More promisingly, the attention-based approach generalizes better to unseen noise conditions.
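
The encoder, attention, and generator pipeline named in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not specify the layer details, so everything concrete here is an assumption made for illustration: a single tanh projection stands in for the LSTM encoder, the attention uses dot-product scores restricted to the current and past frames, and the generator is modeled as a sigmoid mask applied to the noisy magnitude spectrum. The names `causal_attention`, `enhance`, `w_enc`, and `w_gen` are hypothetical, and the weights are random rather than trained.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax; -inf entries receive weight 0.
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)


def causal_attention(hidden):
    """Attend from each frame to itself and earlier frames.

    hidden: (T, d) encoder outputs.
    Returns the context vectors (T, d) and attention weights (T, T).
    """
    num_frames = hidden.shape[0]
    scores = hidden @ hidden.T  # dot-product similarity (assumed scoring function)
    future = np.triu(np.ones((num_frames, num_frames), dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)  # hide future frames (assumed causal)
    weights = softmax(scores, axis=-1)
    context = weights @ hidden
    return context, weights


def enhance(noisy_mag, w_enc, w_gen):
    """noisy_mag: (T, F) magnitude spectrum. Returns enhanced (T, F) and weights."""
    # Stand-in for the LSTM encoder: a frame-wise tanh projection (assumption).
    hidden = np.tanh(noisy_mag @ w_enc)
    context, weights = causal_attention(hidden)
    # Assumed generator: sigmoid mask computed from [context; hidden].
    features = np.concatenate([context, hidden], axis=-1)
    mask = 1.0 / (1.0 + np.exp(-(features @ w_gen)))
    return mask * noisy_mag, weights


rng = np.random.default_rng(0)
T, F, d = 6, 5, 4  # frames, frequency bins, hidden size (toy values)
noisy = np.abs(rng.standard_normal((T, F)))
w_enc = rng.standard_normal((F, d)) * 0.1
w_gen = rng.standard_normal((2 * d, F)) * 0.1
enhanced, weights = enhance(noisy, w_enc, w_gen)
```

In this sketch each row of `weights` sums to one and is zero above the diagonal, so every enhanced frame depends only on the current and earlier noisy frames. A trained model would replace the random weights and the tanh projection with learned LSTM and generator parameters.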

Full Text