Abstract

Recently, supervised speaker-independent speech separation methods, such as deep clustering and permutation invariant training, have outperformed conventional unsupervised speech separation methods. However, their performance drops sharply in reverberant environments. To address this problem, we propose a multi-channel speech separation algorithm that exploits spatial information at both the input and the output of deep clustering. It first extracts a spatial feature, the interaural phase difference (IPD), and uses it as one of the input features of the single-channel deep clustering algorithm. It then uses deep clustering as the noise estimation component of deep-learning-based beamforming. The novelty of the proposed algorithm is that it extends spatial-feature-based deep clustering to a multi-channel algorithm that uses spatial information on both sides of the deep clustering network. It has two advantages. First, the IPD spatial feature significantly improves the robustness of deep clustering in reverberant environments. Second, deep-clustering-based beamforming, being a linear algorithm, suffers less nonlinear distortion than single-channel deep clustering. We compare the proposed algorithm with the single-channel deep clustering algorithm, spatial-feature-based multi-channel deep clustering with IPD, and deep-clustering-based beamforming without IPD in reverberant environments. Experimental results show that the proposed algorithm performs significantly better than all three comparison methods.
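
To make the IPD feature concrete, the following is a minimal sketch of how a phase difference between two microphone channels could be computed per time-frequency bin. The abstract does not specify the STFT settings or how the IPD is encoded, so the frame length, hop size, the cosine/sine encoding, and the function names `stft` and `ipd_features` are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def stft(x, frame_len=512, hop=256):
    """Naive STFT: Hann-windowed frames followed by a real FFT.

    frame_len and hop are illustrative defaults, not values from the paper.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack(
        [x[i * hop: i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1)  # shape: (n_frames, frame_len // 2 + 1)


def ipd_features(ch1, ch2, frame_len=512, hop=256):
    """Hypothetical helper: phase difference between two channels per bin.

    Returns cos(IPD) and sin(IPD) concatenated along the frequency axis.
    This encoding is an assumption; it avoids the wrap-around at +/- pi.
    """
    y1 = stft(ch1, frame_len, hop)
    y2 = stft(ch2, frame_len, hop)
    # Phase of y1 * conj(y2) equals phase(y1) - phase(y2), wrapped to (-pi, pi].
    ipd = np.angle(y1 * np.conj(y2))
    return np.concatenate([np.cos(ipd), np.sin(ipd)], axis=1)


# Usage: two synthetic one-second channels at 16 kHz.
rng = np.random.default_rng(0)
mic1 = rng.standard_normal(16000)
mic2 = np.roll(mic1, 2)  # crude stand-in for an inter-microphone delay
features = ipd_features(mic1, mic2)
```

In a pipeline like the one described, such a feature matrix would be stacked with the single-channel spectral features before being fed to the deep clustering network; how the two are combined is not stated in the abstract.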
