Abstract
An understanding of the contribution of different spectral bands to perceptual information in speech is necessary to design objective measures for the prediction of speech intelligibility. Existing band-weighting functions (BWFs) perform well in predicting the intelligibility of noise-masked speech. However, no suitable BWF exists for noise-suppressed speech, which suffers from nonlinear distortions introduced by noise-suppression processing. This work introduces a new data-driven BWF for improving the intelligibility prediction performance of the frequency-weighted segmental signal-to-noise ratio (fwSNRseg) measure. The data-driven BWF assigns a weight to each band using the band-specific correlation coefficient between accumulated segmental SNRs and speech intelligibility scores from a training dataset. When evaluated on noise-suppressed speech samples processed by eight different noise-suppression algorithms, the proposed data-driven BWF yielded significantly better speech intelligibility prediction performance than the articulation index weights.
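The weighting idea can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `data_driven_bwf` and `fwsnrseg`, the use of Pearson correlation, the clipping of negative correlations to zero, the normalisation of weights to sum to one, and the [-10, 35] dB clipping range for segmental SNR are all assumptions made here for illustration, since the abstract does not specify them.

```python
import numpy as np


def data_driven_bwf(band_snr, intelligibility):
    """Derive band weights from per-band correlation with intelligibility.

    band_snr:        (n_conditions, n_bands) accumulated segmental SNR per band
    intelligibility: (n_conditions,) intelligibility scores for the same conditions
    """
    band_snr = np.asarray(band_snr, dtype=float)
    scores = np.asarray(intelligibility, dtype=float)

    # Pearson correlation between each band's SNR and the scores (assumed metric).
    snr_c = band_snr - band_snr.mean(axis=0)
    sc_c = scores - scores.mean()
    denom = np.sqrt((snr_c ** 2).sum(axis=0) * (sc_c ** 2).sum())
    r = np.divide(snr_c.T @ sc_c, denom,
                  out=np.zeros(band_snr.shape[1]), where=denom > 0)

    # Assumption: bands with negative correlation get zero weight,
    # and the weights are normalised to sum to one.
    w = np.clip(r, 0.0, None)
    total = w.sum()
    return w / total if total > 0 else np.full(w.shape, 1.0 / w.size)


def fwsnrseg(seg_band_snr, weights, lo=-10.0, hi=35.0):
    """Weighted segmental SNR averaged over frames.

    seg_band_snr: (n_frames, n_bands) per-frame, per-band SNR in dB
    weights:      (n_bands,) band-weighting function
    lo, hi:       assumed clipping range in dB
    """
    seg = np.clip(np.asarray(seg_band_snr, dtype=float), lo, hi)
    w = np.asarray(weights, dtype=float)
    per_frame = (seg * w).sum(axis=1) / w.sum()
    return per_frame.mean()
```

In this sketch, the weights would be fitted once on a training set and then reused when scoring unseen noise-suppressed speech with `fwsnrseg`.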