Abstract

The performance of far-field speaker identification (SI) systems is usually degraded by the well-known mismatch problem caused by environmental conditions. Speech enhancement methods are a convenient way of reducing the mismatch created by additive noise and reverberation. The human auditory system's ability to identify and segregate speakers in complex acoustic environments has motivated researchers to exploit known aspects of binaural hearing in speech separation and enhancement methods. This paper proposes a solution to the mismatch problem that employs binaural speech separation as front-end processing for i-vector-based speaker identification. The speech separation approaches use binaural masks to enhance mixture signals in realistic environmental conditions and thereby improve the performance of the SI system. For this purpose, two binaural masks, namely the model-based expectation-maximization interaural coherence mask (MEICM) and a recently introduced DNN-based mask, are employed in the framework of the proposed SI structure. To evaluate the new binaural SI structure, an experiment is conducted that examines various ratio masks in i-vector-based speaker identification under diffuse multi-talker babble noise and reverberation. The simulation results show that employing the DNN-based ratio mask in the binaural speech separation front-end achieves the highest identification performance among the mask estimation methods compared.
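As a rough illustration of what a ratio mask does in a separation front-end, the sketch below builds an oracle ratio mask from known speech and noise power and applies it to a mixture spectrogram. It is not the MEICM or the DNN-based mask from the paper: both of those estimate the mask from the binaural mixture alone, whereas this example assumes the clean components are available. The function names `ratio_mask` and `apply_mask`, the random test spectrograms, and the `eps` constant are all assumptions made for illustration.

```python
import numpy as np


def ratio_mask(speech_power, noise_power, eps=1e-12):
    """Oracle ratio mask: fraction of the power in each time-frequency
    unit that belongs to the target speech (values lie in [0, 1])."""
    return speech_power / (speech_power + noise_power + eps)


def apply_mask(mixture_spec, mask):
    """Scale each time-frequency unit of the mixture by its mask value."""
    return mask * mixture_spec


# Hypothetical complex spectrograms (frequency bins x frames).
# In practice these would come from an STFT of the recorded signals.
rng = np.random.default_rng(0)
n_freq, n_frames = 4, 5
speech = (rng.standard_normal((n_freq, n_frames))
          + 1j * rng.standard_normal((n_freq, n_frames)))
noise = 0.5 * (rng.standard_normal((n_freq, n_frames))
               + 1j * rng.standard_normal((n_freq, n_frames)))
mixture = speech + noise

mask = ratio_mask(np.abs(speech) ** 2, np.abs(noise) ** 2)
enhanced = apply_mask(mixture, mask)
```

In an SI pipeline of the kind described above, the enhanced spectrogram (or the time-domain signal resynthesized from it) would then be passed to feature extraction in place of the raw mixture.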
