Abstract

An extension of the noise gating method for speech enhancement in crowded social environments was investigated. Parallel noise gating processes the same sound stream through noise gating several times and averages the results to obtain the output. A voice activity detection module checks for continuity in pitch and formant frequency to help identify the target speaker. The output shows reduced signal distortion and fewer digital artifacts, a result verified through both objective and subjective tests. In a listening test involving 10 subjects in a low-context Speech Perception in Noise task, with crowd noise mixed in at 0 dB SNR, parallel noise gating yielded an average word recognition accuracy of 87%, compared to 56% for the standard implementation of noise gating and 48% for the original noisy signal. Similar results were found at 5 and 10 dB SNR, although the improvements were less dramatic. In all cases, parallel noise gating scored significantly higher in intelligibility and clarity than both the standard implementation of noise gating and the original noisy signal. The algorithm is fast enough for real-time processing and can be easily adapted to work with other speech separation methods.
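
The "several passes, then average" idea can be illustrated with a minimal sketch. The abstract does not say how the individual passes differ, so this example assumes, purely for illustration, that each pass uses a different gate threshold. The functions `noise_gate` and `parallel_noise_gate`, the frame length, and the threshold values are all hypothetical, and the gate here is a crude time-domain RMS gate rather than the authors' implementation. The voice activity detection stage is not modelled.

```python
import math

def noise_gate(samples, threshold, frame_len=4):
    """Hard gate (illustrative): zero every frame whose RMS is below `threshold`."""
    out = []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        rms = math.sqrt(sum(x * x for x in frame) / len(frame))
        gain = 1.0 if rms >= threshold else 0.0
        out.extend(x * gain for x in frame)
    return out

def parallel_noise_gate(samples, thresholds, frame_len=4):
    """Run one gate per threshold on the same input, then average the outputs.

    Assumption: the passes differ only in threshold; the source does not specify this.
    """
    passes = [noise_gate(samples, t, frame_len) for t in thresholds]
    return [sum(vals) / len(passes) for vals in zip(*passes)]

# Toy signal: one quiet frame, one medium frame, one loud frame.
signal = [0.01] * 4 + [0.2] * 4 + [0.8] * 4

single = noise_gate(signal, threshold=0.3)
averaged = parallel_noise_gate(signal, thresholds=[0.05, 0.3, 0.5])
```

Under these assumptions, the single gate removes the medium-level frame entirely, while the averaged output attenuates it to one third of its level because only one of the three passes lets it through. This softer, graded gain is one plausible reason averaging could reduce distortion relative to a single hard gate, though the abstract itself does not state the mechanism.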

