Abstract

Permutation ambiguity is a crucial issue for deep learning based talker-independent speaker separation. Deep clustering and permutation invariant training (PIT) have been widely used to address the permutation ambiguity problem in monaural scenarios. Although both approaches have been extended to multi-microphone scenarios, we believe that the permutation ambiguity problem can be naturally avoided by leveraging the spatial relations of multiple speakers. In this study, we present location-based training (LBT), a new approach to achieving talker independency in multi-channel speaker separation. Unlike PIT, which examines all possible permutations, LBT assigns speakers according to their positions in physical space. With a training complexity that is linear in the number of concurrent speakers, LBT is computationally much more efficient than PIT, whose complexity is factorial, particularly when a large number of overlapping speakers need to be separated. Specifically, we propose two training criteria, azimuth-based and distance-based training, which use speaker azimuths and distances relative to a microphone array, respectively. Evaluation results show that LBT significantly outperforms PIT on two-speaker and three-speaker mixtures with different array geometries and in various acoustic conditions. In addition, we propose a joint training strategy that integrates azimuth-based and distance-based training and further improves separation performance.
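
The contrast between the two criteria can be illustrated with a minimal sketch. It is not the implementation evaluated in this study. The function names (`mse`, `pit_loss`, `lbt_loss`), the mean-squared-error loss, and the ascending sort order are all assumptions made for illustration. PIT takes the minimum loss over all n! output-to-speaker assignments, whereas LBT evaluates the single assignment obtained by ordering speakers by a location cue such as azimuth or distance.

```python
import itertools
import numpy as np

def mse(est, ref):
    # Placeholder signal-level loss (assumption; the abstract does not name one).
    return float(np.mean((est - ref) ** 2))

def pit_loss(estimates, references):
    # PIT: evaluate every output-to-speaker permutation and keep the best one.
    # The number of permutations grows factorially with the speaker count.
    n = len(references)
    return min(
        sum(mse(estimates[i], references[perm[i]]) for i in range(n)) / n
        for perm in itertools.permutations(range(n))
    )

def lbt_loss(estimates, references, positions):
    # LBT: output i is tied to the speaker with the i-th smallest location
    # value (azimuth or distance), so only one assignment is evaluated.
    # Ascending order is an illustrative choice.
    order = np.argsort(positions)
    n = len(references)
    return sum(mse(estimates[i], references[order[i]]) for i in range(n)) / n

# Hypothetical usage: three speakers with made-up azimuths in degrees.
rng = np.random.default_rng(0)
references = rng.standard_normal((3, 100))
azimuths = np.array([120.0, 30.0, 75.0])
estimates = references[np.argsort(azimuths)]  # outputs already in azimuth order
print(lbt_loss(estimates, references, azimuths), pit_loss(estimates, references))
```

In this sketch the LBT loss needs n pairwise comparisons per training example, while the brute-force PIT loss needs n! assignments of n comparisons each, which is the efficiency gap described above.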
