Abstract
A new computational approach is being investigated for determining the intelligibility of speech subjected to a waveform degradation or signal-processing transformation. The approach employs a model of the human auditory system which reduces a speech waveform to a sequence of discrete symbols, each representing a prototypical vector of parameter values measured in a single frame (10 ms) of speech. The perceptual effect of the transformation is estimated by assessing the consistency between the symbol sequence derived from an untransformed (input) speech signal, and that derived from a transformed (output) signal. This is implemented via calculation of percent transmitted information. Degradations studied thus far include linear filtering and additive white noise. Using parameters that represent only band energies, transmitted information is halved for 2.5-kHz highpass and 3-kHz lowpass filtering. By contrast, the effect of noise is much more severe: halving is observed at approximately +30 dB SNR. Efforts are currently underway to improve the model’s behavior, particularly in noise. [Work supported by NIH.]
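The percent-transmitted-information measure described above can be illustrated with a minimal sketch. It assumes the two symbol sequences are frame-aligned and of equal length, and that "percent transmitted information" means the mutual information between input and output symbols divided by the entropy of the input symbols. That normalization, the plug-in (relative-frequency) probability estimates, and the function name `percent_transmitted_information` are all illustrative assumptions, not details given in the abstract.

```python
import math
from collections import Counter


def percent_transmitted_information(input_symbols, output_symbols):
    """Estimate percent transmitted information between two symbol sequences.

    Assumption: the measure is mutual information I(X;Y) normalized by the
    input entropy H(X), with probabilities estimated from relative frequencies.
    """
    if len(input_symbols) != len(output_symbols):
        raise ValueError("sequences must be frame-aligned and of equal length")
    n = len(input_symbols)
    if n == 0:
        raise ValueError("sequences must not be empty")

    # Count joint and marginal occurrences, one symbol pair per frame.
    joint = Counter(zip(input_symbols, output_symbols))
    count_in = Counter(input_symbols)
    count_out = Counter(output_symbols)

    # Mutual information in bits: sum p(x,y) * log2(p(x,y) / (p(x) p(y))).
    mutual_info = 0.0
    for (x, y), c in joint.items():
        mutual_info += (c / n) * math.log2(c * n / (count_in[x] * count_out[y]))

    # Entropy of the input symbol sequence in bits.
    input_entropy = -sum((c / n) * math.log2(c / n) for c in count_in.values())
    if input_entropy == 0.0:
        # A constant input sequence carries no information to transmit.
        return 0.0

    return 100.0 * mutual_info / input_entropy
```

Under these assumptions, an output sequence identical to the input yields 100 percent, while an output sequence statistically independent of the input yields 0 percent. With short sequences and large symbol inventories, plug-in estimates of this kind are biased upward, so any real evaluation would need enough frames per symbol.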