Speech-based Envelope Power Spectrum Model Research Articles

Four existing speech intelligibility models with different theoretical assumptions were used to predict previously published behavioural data. Those data showed that complex tones with pitch-related periodicity are far less effective maskers of speech than aperiodic noise. This so-called masker-periodicity benefit (MPB) far exceeded the fluctuating-masker benefit (FMB) obtained from slow masker envelope fluctuations. In contrast, the normal-hearing listeners hardly benefitted from periodicity in the target speech. All tested models consistently underestimated MPB and FMB, while most of them also overestimated the intelligibility of vocoded speech. To understand these shortcomings, the internal signal representations of the models were analysed in detail. The best-performing model, the correlation-based version of the speech-based envelope power spectrum model (sEPSMcorr), combined an auditory processing front end with a modulation filterbank and a correlation-based back end. This model was then modified to further improve the predictions. The resulting second version (sEPSMcorr2) outperformed the original model with all tested maskers and accounted for about half the MPB, which can be attributed to reduced modulation masking caused by the periodic maskers. However, as the sEPSMcorr2 failed to account for the other half of the MPB, the results also indicate that future models should consider the contribution of pitch-related effects, such as enhanced stream segregation, to further improve their predictive power.

Diagnosing and treating hearing impairment is challenging because people with similar degrees of sensorineural hearing loss (SNHL) often have different speech-recognition abilities. The speech-based envelope power spectrum model (sEPSM) has demonstrated that the signal-to-noise ratio (SNRENV) from a modulation filter bank provides a robust speech-intelligibility measure across a wider range of degraded conditions than many long-standing models. In the sEPSM, noise (N) is assumed to: (a) reduce S + N envelope power by filling in dips within clean speech (S) and (b) introduce an envelope noise floor from intrinsic fluctuations in the noise itself. While the promise of SNRENV has been demonstrated for normal-hearing listeners, it has not been thoroughly extended to hearing-impaired listeners because of limited physiological knowledge of how SNHL affects speech-in-noise envelope coding relative to noise alone. Here, envelope coding to speech-in-noise stimuli was quantified from auditory-nerve model spike trains using shuffled correlograms, which were analyzed in the modulation-frequency domain to compute modulation-band estimates of neural SNRENV. Preliminary spike-train analyses show strong similarities to the sEPSM, demonstrating feasibility of neural SNRENV computations. Results suggest that individual differences can occur based on differential degrees of outer- and inner-hair-cell dysfunction in listeners currently diagnosed into the single audiological SNHL category. The predicted acoustic-SNR dependence in individual differences suggests that the SNR-dependent rate of susceptibility could be an important metric in diagnosing individual differences. Future measurements of the neural SNRENV in animal studies with various forms of SNHL will provide valuable insight for understanding individual differences in speech-in-noise intelligibility.
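
The two assumed effects of noise in the sEPSM can be illustrated with a minimal numerical sketch. The Python below is not the published sEPSM or the neural spike-train analysis described above: it skips the auditory filterbank, uses rectangular modulation bands computed by FFT in place of a modulation filter bank, and the function names (`envelope`, `band_envelope_power`, `snr_env`), the band edges, and the `floor` value are all assumptions chosen for illustration. It estimates the speech envelope power in each band as the S + N envelope power minus the N envelope power, then divides by the N envelope power to obtain a per-band SNRENV.

```python
import numpy as np


def envelope(x):
    """Hilbert envelope computed from the FFT-based analytic signal."""
    n = len(x)
    spectrum = np.fft.fft(x)
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(spectrum * weights))


def band_envelope_power(env, fs, f_lo, f_hi):
    """Envelope power in the band [f_lo, f_hi) Hz, normalized by the DC power."""
    n = len(env)
    spectrum = np.fft.rfft(env) / n
    freqs = np.fft.rfftfreq(n, 1.0 / fs)
    dc_power = np.abs(spectrum[0]) ** 2
    in_band = (freqs >= f_lo) & (freqs < f_hi)
    # Factor 2 converts the one-sided spectrum into total AC power.
    return 2.0 * np.sum(np.abs(spectrum[in_band]) ** 2) / dc_power


def snr_env(mixture, noise, fs, bands, floor=1e-3):
    """Per-band SNRenv from the S + N mixture and the noise alone.

    `bands` is a list of (f_lo, f_hi) modulation-band edges in Hz.
    `floor` is an assumed lower limit on envelope power (illustrative value)
    that keeps the ratio finite when a band holds almost no power.
    """
    env_mix = envelope(mixture)
    env_noise = envelope(noise)
    values = []
    for f_lo, f_hi in bands:
        p_mix = band_envelope_power(env_mix, fs, f_lo, f_hi)
        p_noise = max(band_envelope_power(env_noise, fs, f_lo, f_hi), floor)
        # Speech envelope power is estimated by subtracting the noise floor.
        p_speech = max(p_mix - p_noise, floor)
        values.append(p_speech / p_noise)
    return np.array(values)
```

As a usage example, a 4 Hz amplitude-modulated tone can stand in for speech and Gaussian noise for the masker. Calling `snr_env(speech + noise, noise, fs, [(2, 8)])` at two acoustic SNRs should then give a lower value at the lower SNR, because the noise both fills the envelope dips and contributes its own envelope fluctuations.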

Articles published on Speech-based Envelope Power Spectrum Model

Predicting the effects of periodicity on the intelligibility of masked speech: An evaluation of different modelling approaches and their limitations.

The speech-based envelope power spectrum model (sEPSM) family: Development, achievements, and current challenges

Predicting phoneme and word recognition in noise using a computational model of the auditory periphery.

Predicting speech intelligibility based on a correlation metric in the envelope power spectrum domain.

Envelope and intensity based prediction of psychoacoustic masking and speech intelligibility.

Predicting binaural speech intelligibility using the signal-to-noise ratio in the envelope power spectrum domain.

Neural Spike-Train Analyses of the Speech-Based Envelope Power Spectrum Model

Effects of manipulating the signal-to-noise envelope power ratio on speech intelligibility.

A multi-resolution envelope-power based model for speech intelligibility

The role of high-frequency envelope fluctuations for speech masking release

The role of across-frequency envelope processing for speech intelligibility

Prediction of speech masking release for fluctuating interferers based on the envelope power signal-to-noise ratio

Predicting speech intelligibility based on the signal-to-noise envelope power ratio after modulation-frequency selective processing
