Abstract

Several filterbank-based metrics have been proposed to predict speech intelligibility (SI). However, these metrics incorporate little knowledge of the auditory periphery. Neurogram-based metrics provide an alternative, incorporating knowledge of the physiology of hearing by using a mathematical model of the auditory nerve response. In this work, SI was assessed utilizing different filterbank-based metrics (the speech intelligibility index and the speech-based envelope power spectrum model) and neurogram-based metrics, using the biologically inspired model of the auditory nerve proposed by Zilany, Bruce, Nelson, and Carney [(2009), J. Acoust. Soc. Am. 126(5), 2390-2412] as a front-end and the neurogram similarity metric and spectro temporal modulation index as a back-end. Then, the correlations with behavioural scores were computed. Results showed that neurogram-based metrics representing the speech envelope showed higher correlations with the behavioural scores at a word level. At a per-phoneme level, it was found that phoneme transitions contribute to higher correlations between objective measures that use speech envelope information at the auditory periphery level and behavioural data. The presented framework could function as a useful tool for the validation and tuning of speech materials, as well as a benchmark for the development of speech processing algorithms.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.