Abstract
In this study, we propose the gammachirp envelope distortion index (GEDI), a measure based on the signal-to-distortion ratio in the auditory envelope (SDRenv), to predict the intelligibility of speech enhanced by nonlinear algorithms. GEDI calculates the distortion between enhanced-speech and clean-speech representations in the temporal-envelope domain, where the envelopes are extracted by a gammachirp auditory filterbank and a modulation filterbank. We also extend GEDI with multi-resolution analysis (mr-GEDI) to predict the intelligibility of speech under non-stationary noise conditions. We evaluate GEDI on speech enhanced by classic spectral subtraction and by Wiener filtering. The predictions are compared with human results at various signal-to-noise ratios with additive pink and babble noise. The results show that mr-GEDI predicted the intelligibility curves better than the short-time objective intelligibility (STOI) measure, the extended STOI (ESTOI) measure, and the hearing-aid speech perception index (HASPI) under pink-noise conditions, and better than HASPI under babble-noise conditions. mr-GEDI does not show a tendency to overestimate intelligibility and is considered a more conservative approach than STOI and ESTOI. Evaluation with mr-GEDI may therefore provide additional information in the development of speech enhancement algorithms.
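To make the idea of an envelope-domain signal-to-distortion ratio concrete, the following is a minimal sketch, not the GEDI implementation. It assumes a single broadband channel and replaces the gammachirp auditory filterbank and the modulation filterbank with a crude envelope extractor (full-wave rectification followed by a moving average). The function names `envelope` and `sdr_env_db`, the window length `win`, and the exact ratio (clean-envelope power over the power of the envelope difference, in dB) are illustrative assumptions rather than definitions taken from the abstract.

```python
import math


def envelope(signal, win):
    """Crude temporal envelope (assumption: stands in for the auditory
    and modulation filterbanks): rectify, then moving-average over `win` samples."""
    rect = [abs(x) for x in signal]
    out = []
    acc = 0.0
    for i, v in enumerate(rect):
        acc += v
        if i >= win:
            acc -= rect[i - win]  # drop the sample that left the window
        out.append(acc / min(i + 1, win))
    return out


def sdr_env_db(clean, enhanced, win=80):
    """Envelope-domain signal-to-distortion ratio in dB (illustrative).

    The distortion is taken as the difference between the enhanced and
    clean envelopes; both signals are assumed to be time-aligned and of
    equal length.
    """
    env_clean = envelope(clean, win)
    env_enh = envelope(enhanced, win)
    n = len(env_clean)
    p_clean = sum(e * e for e in env_clean) / n
    p_dist = sum((a - b) ** 2 for a, b in zip(env_enh, env_clean)) / n
    if p_dist == 0.0:
        return math.inf  # no envelope distortion at all
    return 10.0 * math.log10(p_clean / p_dist)
```

In this sketch a larger value means the enhanced envelope stays closer to the clean one. A full model would compute such a ratio per auditory and modulation channel and then map the combined value to an intelligibility score; that stage is omitted here.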