Abstract

An accurate Ideal Binary Mask (IBM) estimate is essential for Missing Feature Theory (MFT)-based speaker identification, because incorrectly labelled spectral components (each component is labelled either reliable or unreliable) degrade the performance of an Automatic Speaker Identification (ASI) system in the presence of noise. In this work, a Bidirectional Recurrent Neural Network (BRNN) with Long Short-Term Memory (LSTM) cells is proposed for improved IBM estimation. Compared with the previously proposed Multilayer Perceptron (MLP)-IBM estimator, the proposed system improved IBM estimation accuracy by an average of 4.5% and MFT-based speaker identification accuracy by an average of 3.1% over all tested SNR (dB) levels. When used for speech enhancement, the proposed system improved MOS-LQO (an objective quality measure) by an average of 0.32 and QSTI (an objective intelligibility measure) by an average of 0.01 over all tested SNR (dB) levels, again compared with the MLP-IBM estimator. These results highlight the effectiveness of the proposed BRNN-IBM estimator for MFT-based speaker identification and IBM-based speech enhancement.
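
To make the reliable/unreliable labelling concrete, the following is a minimal sketch of how an ideal binary mask and a mask accuracy score might be computed. The abstract does not state the mask criterion, so the sketch assumes the commonly used local-SNR rule: a time-frequency unit is reliable when its local SNR exceeds a threshold, here assumed to be 0 dB. The function names `ideal_binary_mask` and `mask_accuracy`, the parameter `lc_db`, and the toy power values are hypothetical and are not taken from the paper.

```python
import numpy as np

def ideal_binary_mask(clean_power, noise_power, lc_db=0.0):
    """Label each time-frequency unit as reliable (1) or unreliable (0).

    Assumption: a unit is reliable when its local SNR in dB exceeds
    lc_db. The 0 dB default is an illustrative choice, not the paper's.
    """
    eps = np.finfo(float).tiny  # guards against division by zero and log(0)
    local_snr_db = 10.0 * np.log10((clean_power + eps) / (noise_power + eps))
    return (local_snr_db > lc_db).astype(np.uint8)

def mask_accuracy(estimated_mask, ideal_mask):
    """Fraction of time-frequency units whose labels match the ideal mask."""
    return float(np.mean(estimated_mask == ideal_mask))

# Toy 2x2 power spectrograms (rows: frequency bands, columns: frames).
clean = np.array([[4.0, 0.1], [1.0, 9.0]])
noise = np.array([[1.0, 1.0], [2.0, 3.0]])

ibm = ideal_binary_mask(clean, noise)

# A hypothetical estimator output that mislabels one of the four units.
estimate = np.array([[1, 0], [1, 1]], dtype=np.uint8)
acc = mask_accuracy(estimate, ibm)
```

In this toy case one of the four units is mislabelled, so `acc` is 0.75. An estimator such as the MLP or BRNN described above would produce `estimate` from noisy features alone, since the separate clean and noise powers needed for the ideal mask are unavailable at test time.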
