Abstract

The performance of speech enhancement algorithms deteriorates rapidly with decreasing signal-to-noise ratio (SNR). At a low SNR, high-intensity phonemes such as vowels are therefore more likely to be enhanced than low-intensity speech segments such as many consonants. Although the selective enhancement of vowels enhances transitional cues for consonant recognition, it simultaneously degrades relative amplitude cues. Experiments with normal-hearing subjects were performed to determine the overall effect of selective enhancement of vowels on the intelligibility of consonants in consonant–vowel–consonant utterances. In quiet, a 12-dB enhancement of the vowels did not significantly reduce consonant intelligibility compared with an unenhanced control condition at 65 dB(A). When unenhanced utterances were presented in background noise with an average SNR of −6 dB at the vowel segments, 50.1% of the consonants were correctly identified. When the vowels were selectively amplified by 12 dB while the consonant SNR remained unchanged, 69.8% of the consonants were recognised. Equal enhancement of the vowels and consonants by 12 dB, however, led to 91.5% consonant recognition. We conclude that speech enhancement algorithms should enhance all speech segments to the greatest possible extent, even if this leads to selective enhancement of some phoneme categories over others.
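
The level arithmetic behind the noise conditions can be illustrated with a minimal sketch. It is not the processing used in the study: the signals are random stand-ins for speech and noise, the vowel boundaries `v0` and `v1` are hypothetical, and the helper names are introduced here for illustration only. It assumes that selective enhancement means scaling the vowel samples by a fixed gain while the noise is left untouched, so a 12-dB gain raises the vowel SNR from −6 dB to +6 dB and leaves the consonant SNR where it was.

```python
import math
import random


def db_to_gain(db):
    """Convert a level change in dB to a linear amplitude factor."""
    return 10.0 ** (db / 20.0)


def power(x):
    """Mean power of a sequence of samples."""
    return sum(v * v for v in x) / len(x)


def snr_db(signal, noise):
    """SNR in dB from the mean powers of a signal and a noise segment."""
    return 10.0 * math.log10(power(signal) / power(noise))


def amplify_segment(x, start, stop, gain_db):
    """Return a copy of x with samples in [start, stop) scaled by gain_db."""
    g = db_to_gain(gain_db)
    return [v * g if start <= i < stop else v for i, v in enumerate(x)]


random.seed(0)
n = 3000
v0, v1 = 1000, 2000  # hypothetical vowel boundaries in a C-V-C utterance

# Random stand-ins for the speech and noise waveforms (assumption).
speech = [random.gauss(0.0, 1.0) for _ in range(n)]
noise = [random.gauss(0.0, 1.0) for _ in range(n)]

# Scale the noise so that the SNR over the vowel segment is -6 dB.
scale = math.sqrt(power(speech[v0:v1]) / power(noise[v0:v1])) * db_to_gain(6.0)
noise = [v * scale for v in noise]

# Selectively amplify the vowel segment by 12 dB; the noise is unchanged.
enhanced = amplify_segment(speech, v0, v1, 12.0)

vowel_before = snr_db(speech[v0:v1], noise[v0:v1])
vowel_after = snr_db(enhanced[v0:v1], noise[v0:v1])
cons_before = snr_db(speech[:v0], noise[:v0])
cons_after = snr_db(enhanced[:v0], noise[:v0])

print(f"vowel SNR: {vowel_before:.1f} dB before, {vowel_after:.1f} dB after")
print(f"initial consonant SNR: {cons_before:.1f} dB before, {cons_after:.1f} dB after")
```

Under these assumptions the vowel-to-consonant level difference also grows by 12 dB, which is one way to picture the change in relative amplitude cues that the abstract describes.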
