Abstract
A dereverberation method based on generalized spectral subtraction (GSS) using a multi-channel least mean square (MCLMS) approach has achieved significantly better results in speech recognition experiments than conventional methods. In this study, we apply this method to hands-free speaker identification. Although the GSS-based dereverberation method is very effective for speech recognition, it degrades speaker identification performance when clean speech models are used. One likely reason is that GSS-based dereverberation introduces distortion, creating a mismatch between clean speech and dereverberant speech. We address this problem by training speaker models on dereverberant speech, obtained by suppressing reverberation from artificial reverberant speech. We also propose a method that combines various compensation parameter sets to improve speaker identification, together with an efficient computational procedure. The speaker identification experiments were performed on large-scale far-field speech in reverberant environments different from the training environments. The proposed method achieved a relative error reduction of 87.5% compared with conventional cepstral mean normalization with beamforming using clean speech models, and of 44.8% compared with reverberant speech models.
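To illustrate the core operation behind GSS-based dereverberation, the sketch below shows generalized spectral subtraction applied per frequency bin: the estimated late-reverberation magnitude spectrum is subtracted from the observed spectrum in a power-of-p domain, with spectral flooring to avoid negative values. The function name, the parameter defaults, and the way the late-reverberation estimate is obtained are illustrative assumptions, not the paper's exact formulation (which also involves the MCLMS channel estimate).

```python
import numpy as np

def gss_dereverb(obs_mag, late_reverb_mag, alpha=1.0, beta=0.01, p=1.0):
    """Generalized spectral subtraction (illustrative sketch).

    obs_mag         : observed magnitude spectrum for one frame (1-D array)
    late_reverb_mag : estimated late-reverberation magnitude spectrum
    alpha           : over-subtraction factor (assumed parameter)
    beta            : spectral floor factor (assumed parameter)
    p               : exponent of the subtraction domain (p=1 magnitude,
                      p=2 power spectral subtraction)
    """
    # Subtract in the power-of-p domain.
    sub = obs_mag ** p - alpha * late_reverb_mag ** p
    # Floor the result at a small fraction of the observed spectrum
    # so bins never go negative.
    floor = beta * obs_mag ** p
    return np.maximum(sub, floor) ** (1.0 / p)
```

In practice this would be applied frame by frame to an STFT of the reverberant signal, with the phase of the observed signal reused for resynthesis.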