Abstract

Background noise usually impairs the quality and intelligibility of speech during internet voice calls. To address this problem for wearable smart devices, this paper introduces a dual-microphone beamformer assisted by a bone-conduction (BC) sensor, together with a simple recurrent unit (SRU)-based neural network postfilter, for real-time speech enhancement. Because the BC sensor is far less sensitive to environmental noise than a regular air-conduction (AC) microphone, accurate voice activity detection (VAD) can be obtained from the BC signal and incorporated into the adaptive noise canceller (ANC) and adaptive block matrix (ABM). The SRU-based postfilter is a recurrent neural network with a small number of parameters, which improves computational efficiency. Sub-band signal processing compresses the input features of the neural network, and the scale-invariant signal-to-distortion ratio (SI-SDR) is used as the loss function to minimize distortion of the desired speech signal. Experimental results demonstrate that the proposed real-time speech enhancement system significantly improves speech quality and intelligibility for all tested noise types and levels compared with an AC-only beamformer with a postfiltering algorithm.
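The SI-SDR objective mentioned in the abstract can be illustrated with a minimal sketch. It follows the commonly used definition of SI-SDR (project the estimate onto the clean target, then compare the projected energy with the residual energy); the abstract does not give the authors' exact formulation, so the function names `si_sdr` and `si_sdr_loss`, the mean removal, and the `eps` stabilizer are assumptions made for illustration.

```python
import math

def si_sdr(estimate, target, eps=1e-8):
    """Scale-invariant SDR in dB (common textbook form, not the paper's code)."""
    if len(estimate) != len(target):
        raise ValueError("signals must have the same length")
    # Remove the mean so a DC offset does not influence the score.
    mean_e = sum(estimate) / len(estimate)
    mean_t = sum(target) / len(target)
    e = [x - mean_e for x in estimate]
    t = [x - mean_t for x in target]
    # Optimal scaling of the target: projection of the estimate onto the target.
    dot = sum(a * b for a, b in zip(e, t))
    t_energy = sum(b * b for b in t)
    alpha = dot / (t_energy + eps)
    s_target = [alpha * b for b in t]
    # Everything in the estimate that is not the scaled target counts as distortion.
    residual = [a - s for a, s in zip(e, s_target)]
    num = sum(s * s for s in s_target)
    den = sum(r * r for r in residual)
    return 10.0 * math.log10((num + eps) / (den + eps))

def si_sdr_loss(estimate, target):
    """Negative SI-SDR, so that minimizing the loss maximizes SI-SDR."""
    return -si_sdr(estimate, target)
```

Because the target is rescaled before the comparison, multiplying the estimate by a constant gain leaves the score essentially unchanged, which is why this measure penalizes distortion rather than level differences.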

Highlights

  • In recent years, signal transmission bandwidth and network technology have improved significantly, and communication systems can now transmit speech signals in real time with a higher sampling rate and greater bit depth

  • In a non-stationary noise field, when the noise is a non-point source or the noise source lies in the same direction as the desired signal, the array postfiltering algorithm has difficulty benefiting from spatial information. To obtain a real-time simple recurrent unit (SRU)-based neural network noise reduction algorithm with a small number of parameters, this paper treats the array postfiltering as a single-channel noise reduction task

  • The deep neural network (DNN)-based method performs significantly better than the traditional optimally modified log-spectral amplitude (OMLSA) method, because the conventional method relies on the assumption that noise changes much more slowly than speech

Summary

Introduction

Signal transmission bandwidth and network technology have improved significantly, and communication systems can now transmit speech signals in real time with a higher sampling rate and greater bit depth. In a non-stationary noise field, when the noise is a non-point source or the noise source lies in the same direction as the desired signal, the array postfiltering algorithm has difficulty benefiting from spatial information. To obtain a real-time simple recurrent unit (SRU)-based neural network noise reduction algorithm with a small number of parameters, this paper treats the array postfiltering as a single-channel noise reduction task. A single-channel noise reduction system based on SRU was used as a postfilter to eliminate the residual noise in the microphone array output. With the assistance of the BC signal, the noise suppression capability of the dual-microphone GSC system under low ...
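To make the role of the BC-derived VAD in the GSC more concrete, the following is a minimal sketch of a VAD-gated adaptive noise canceller using a normalized LMS (NLMS) update. This summary does not specify the authors' update rule, so the NLMS choice, the function name `vad_gated_nlms`, and the parameters `taps`, `mu`, and `eps` are all assumptions. The sketch only illustrates a common practice in GSC designs: the noise canceller adapts while the VAD reports no speech and is frozen while speech is present, so that the desired speech is not cancelled.

```python
def vad_gated_nlms(primary, reference, speech_active, taps=4, mu=0.5, eps=1e-8):
    """Illustrative VAD-gated NLMS noise canceller (assumed, not the paper's code).

    primary       -- beamformer output containing speech plus residual noise
    reference     -- noise reference (e.g. blocking-matrix output)
    speech_active -- per-sample VAD flags, True while speech is present
    """
    w = [0.0] * taps          # adaptive filter weights
    buf = [0.0] * taps        # most recent reference samples, newest first
    out = []
    for d, x, active in zip(primary, reference, speech_active):
        buf = [x] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))   # noise estimate
        e = d - y                                    # enhanced output sample
        out.append(e)
        if not active:
            # Adapt only during noise-only periods to avoid cancelling speech.
            norm = sum(b * b for b in buf) + eps
            w = [wi + mu * e * bi / norm for wi, bi in zip(w, buf)]
    return out, w
```

With the VAD flags all `False`, the filter is free to learn the path from the noise reference to the primary channel; with the flags all `True`, the weights stay at their initial values and the primary signal passes through unchanged.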

Signal Model
BC Signal
BC VAD Estimator
Robust Generalized Sidelobe Canceller
Adaptive
Robust Compensation Filter for Low Frequencies
Wind Noise Suppression
Feature Compression
Learning
System Processing Pipeline
Performance Metrics
Comparisons
Experimental Setting
Performance Evaluation of Postfiltering
Performance Evaluation of BC–GSC
Performance Evaluation of the Proposed Speech Enhancement Algorithm
Evaluation the ProposedOMLSA
Conclusions