Abstract

Two novel speech enhancement algorithms are presented that automatically increase intelligibility in noisy environments while maintaining the signal power and naturalness of the original speech. These energy redistribution (ER) algorithms move signal energy to targeted regions of relatively high information content that are crucial for intelligibility. The boosted regions have low energy in the original signal and are therefore usually the first segments lost when environmental noise is added. The ER voiced/unvoiced (ERVU) method transfers energy from voiced speech to regions of unvoiced speech, while the ER spectral transition (ERST) method moves energy from spectrally stationary regions to spectrally transitional regions. Hand-held cell phones and public address systems are expected to be the dominant applications for these techniques. Standard noise reduction methods such as spectral subtraction are assumed to have already been applied to the voice signal before broadcast. In human listening tests, both algorithms boosted the intelligibility of speech in noisy environments by nearly 7% over the original unprocessed signals, without degrading naturalness or increasing signal power. Furthermore, both algorithms allow the trade-off between boost gain and speech naturalness to be controlled.
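
The power-preserving idea behind energy redistribution can be illustrated with a minimal sketch. The abstract does not say how voiced/unvoiced or transitional regions are detected, nor how the gains are chosen, so everything below is an assumption made for illustration: the function name `redistribute_energy`, the precomputed `boost_mask`, and a single fixed `boost_gain` are hypothetical and are not the authors' method. The sketch amplifies the marked low-energy samples and attenuates the rest just enough that total signal energy is unchanged.

```python
import math


def energy(samples):
    """Total energy of a signal: the sum of squared sample values."""
    return sum(x * x for x in samples)


def redistribute_energy(samples, boost_mask, boost_gain):
    """Scale masked samples by boost_gain and attenuate the others so that
    total energy stays the same.

    Hypothetical helper: boost_mask marks the regions to boost (for example
    unvoiced or spectrally transitional samples) and is assumed to come from
    some external detector that is not shown here.
    """
    if len(samples) != len(boost_mask):
        raise ValueError("samples and boost_mask must have the same length")

    e_boost = sum(x * x for x, m in zip(samples, boost_mask) if m)
    e_rest = sum(x * x for x, m in zip(samples, boost_mask) if not m)

    if e_rest == 0.0:
        # No energy available to move, so return the signal unchanged.
        return list(samples)

    # Solve g^2 * e_boost + a^2 * e_rest = e_boost + e_rest for a^2.
    rest_gain_sq = 1.0 - (boost_gain ** 2 - 1.0) * e_boost / e_rest
    if rest_gain_sq < 0.0:
        raise ValueError("boost_gain is too large to preserve total energy")
    rest_gain = math.sqrt(rest_gain_sq)

    return [x * (boost_gain if m else rest_gain)
            for x, m in zip(samples, boost_mask)]


# Toy signal: four loud samples followed by four quiet samples to boost.
samples = [1.0, -1.0, 1.0, -1.0, 0.1, -0.1, 0.1, -0.1]
boost_mask = [False] * 4 + [True] * 4
boosted = redistribute_energy(samples, boost_mask, boost_gain=2.0)
```

In this sketch, `boost_gain` plays the role of a single tuning knob: larger values move more energy into the boosted regions at the cost of altering the original signal more, which loosely mirrors the trade-off between boost gain and naturalness mentioned above.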
