Abstract

Significant improvements in the intelligibility of speech in noise can be obtained by modifying the speech signal in the time and/or frequency domains. However, most speech intelligibility enhancement algorithms are designed to take clean speech as input, and their performance suffers once the signal-to-noise ratio of the input speech decreases, a common case in face-to-face communication environments such as restaurants or cafes. In this work we investigate whether a particularly successful speech intelligibility enhancement system (spectral shaping and dynamic range compression), together with various front-end noise reduction methods, might be suitable in such environments. Our evaluations suggest that such a complete system would provide an increase in speech intelligibility equivalent to a 10 dB gain in input signal-to-noise ratio in the more challenging face-to-face communication environments.
