Abstract

We propose a novel approach to improve adaptive decorrelation filtering (ADF)-based speech source separation in diffuse noise. The effects of noise on system adaptation and on separation outputs are handled separately. First, fast noise compensation (NC) is developed for adaptation of the separation filters, forcing ADF to focus on source separation; next, output noise is suppressed by speech enhancement. By tracking noise components in the output cross-correlation functions, the bias effect of noise on the system adaptation objective function is compensated, and by adaptively estimating output noise autocorrelations, the speech separation output is enhanced. For fast noise compensation, a blockwise fast ADF (FADF) is implemented. Experiments were conducted on real and simulated diffuse noise. Speech mixtures were generated by convolving TIMIT speech sources with acoustic path impulse responses measured in a real room with reverberation time T60 = 0.3 s. The proposed techniques significantly improved the separation performance and phone recognition accuracy of ADF outputs.

Highlights

  • Interfering speech and diffuse noise pose a twofold challenge for hands-free automatic speech recognition (ASR) and speech communication

  • For practical applications of blind source separation (BSS), it is important to address the effects of noise on speech separation: (1) noise may degrade the conditions of BSS and hurt separation performance; (2) BSS aims at source separation and has limited ability to suppress diffuse noise

  • With K = N and FFT length N_F = 2N, the computation of the 2N-point FFTs is distributed over a block of length N, resulting in a complexity of O(log N) per time sample for noise-compensated fast ADF (NC-FADF), in contrast to O(N^2) for direct estimation of the NC terms, which requires matrix-vector multiplications
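
The amortization argument in the last highlight can be illustrated with a small NumPy sketch. It is not the authors' NC-FADF update. It is only a generic blockwise cross-correlation, assuming a buffer of 2N input samples (the previous block plus the current one) and a current output block of N samples. The function names and the lag convention are hypothetical choices made for illustration. One pair of 2N-point FFTs per block costs O(N log N), which, spread over the N samples of the block, is O(log N) per sample. The direct loop is included only as a correctness reference and is not meant to reproduce the O(N^2) cost quoted above.

```python
import numpy as np

def block_xcorr_fft(x_buf, y_blk):
    """Lags 0..N-1 of r[lag] = sum_n y_blk[n] * x_buf[N + n - lag],
    computed with 2N-point FFTs (illustrative helper, not from the paper)."""
    n = len(y_blk)
    assert len(x_buf) == 2 * n
    # Place the current block in the second half so that the circular
    # correlation never wraps around for lags 0..N-1.
    y_pad = np.concatenate([np.zeros(n), y_blk])
    spec = np.conj(np.fft.rfft(x_buf)) * np.fft.rfft(y_pad)
    return np.fft.irfft(spec, n=2 * n)[:n]

def block_xcorr_direct(x_buf, y_blk):
    """Same quantity by explicit dot products, used only as a reference."""
    n = len(y_blk)
    return np.array([np.dot(y_blk, x_buf[n - lag:2 * n - lag])
                     for lag in range(n)])

rng = np.random.default_rng(0)
N = 64                                # block length (assumed value)
x_buf = rng.standard_normal(2 * N)    # previous + current input samples
y_blk = rng.standard_normal(N)        # current output block

r_fft = block_xcorr_fft(x_buf, y_blk)
r_direct = block_xcorr_direct(x_buf, y_blk)
max_err = float(np.max(np.abs(r_fft - r_direct)))
```

In this sketch `max_err` should sit at floating-point rounding level, which confirms that the zero-padded 2N-point transform returns the same N lags as the direct sums.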

Summary

INTRODUCTION

Interfering speech and diffuse noise pose a twofold challenge for hands-free automatic speech recognition (ASR) and speech communication. Speech enhancement algorithms that are formulated for stationary noise cannot be applied directly in this scenario, because the adaptation of the separation filters makes the output noise statistics time-varying. Such variation may happen frequently when the mixing acoustic paths change, for example when a speaker moves. In real sound fields, diffuse noise is colored and spatially correlated at low frequencies, which degrades ADF performance more severely than uncorrelated noise does [13]. It appears that noise can be removed from speech inputs prior to ADF separation. Enhancement and phone recognition experiments were conducted, and the results are presented to show the performance of the proposed separation and enhancement techniques.

ADF MODEL IN NOISE
NOISE COMPENSATION FOR ADF
Fast update of compensation terms
Fast ADF and NC-FADF
Tracking of ADF output noise autocorrelations
Enhancement of separated speech
COMPLEXITY ANALYSIS
Experimental data and setup
Speech separation performance
Speech enhancement and phone recognition
Sensitivity to noise estimation
Findings
CONCLUSIONS AND FUTURE WORK
