Abstract

The effort required to listen to and understand noisy speech is an important factor in the evaluation of noise reduction schemes. This paper introduces a model for Listening Effort prediction from Acoustic Parameters (LEAP). The model is based on methods from automatic speech recognition, specifically on performance measures that quantify the degradation of phoneme posteriorgrams produced by a deep neural network: noise or artifacts introduced by speech enhancement often result in a temporal smearing of phoneme representations, which is measured by comparing phoneme vectors over time. This procedure requires no a priori knowledge about the processed speech and is therefore single-ended. The proposed model was evaluated on three datasets of noisy speech signals with listening effort ratings obtained from normal-hearing and hearing-impaired subjects. Its prediction quality was compared to several baseline models: the ITU-T standard P.563 and the American National Standard ANIQUE+, both for single-ended speech quality assessment, and a single-ended SNR estimator. On all three datasets, the proposed model achieved clearly higher prediction accuracy than the baselines, with correlations to subjective ratings above 0.9. So far, the model has been trained on the specific noise types used in the evaluation; future work will address this limitation by training on a variety of noise types in a multi-condition fashion, so that the model generalizes to unknown noise types.
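The abstract names only the general mechanism: phoneme posterior vectors from a DNN acoustic model are compared across time, and temporal smearing of phoneme representations flattens the resulting distances. As a minimal sketch of that idea (not the paper's exact measure), the Python function below computes a mean-temporal-distance-style curve over a posteriorgram, using a symmetric Kullback-Leibler divergence between frames at increasing lags; the function name, the distance choice, and the lag range are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def temporal_distance_curve(posteriorgram, max_lag=50):
    """Average symmetric KL divergence between phoneme posterior vectors
    that lie `lag` frames apart, for lag = 1 .. max_lag.

    posteriorgram: ndarray of shape (T, P); each row holds the acoustic
    model's posterior probabilities over P phoneme classes for one frame.
    Temporal smearing (noise, enhancement artifacts) makes neighboring
    frames more similar and therefore flattens this curve.
    """
    eps = 1e-10                            # avoid log(0)
    p = np.clip(posteriorgram, eps, 1.0)
    max_lag = min(max_lag, len(p) - 1)     # guard against short signals
    curve = np.empty(max_lag)
    for lag in range(1, max_lag + 1):
        a, b = p[:-lag], p[lag:]           # all frame pairs `lag` apart
        # symmetric KL divergence per frame pair, summed over phonemes
        skl = np.sum((a - b) * (np.log(a) - np.log(b)), axis=1)
        curve[lag - 1] = skl.mean()
    return curve

# Toy usage with a random "posteriorgram" (rows normalized to sum to 1).
rng = np.random.default_rng(0)
post = rng.random((500, 40))
post /= post.sum(axis=1, keepdims=True)
print(temporal_distance_curve(post, max_lag=10))
```

Intuitively, clean speech with crisp phoneme transitions yields a curve that rises quickly with the lag and saturates at a high value, whereas smeared posteriorgrams yield flatter curves; a regression stage could then map statistics of such curves to listening effort ratings.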
