Abstract

Clipping is common in speech acquisition, caused by the limited numerical range or the non-linear compensation of recording devices. It inevitably alters the spectrum of the speech signal and thus partially distorts the speaker information the signal carries. This paper investigates the impact of signal clipping on speaker recognition and proposes a simple yet effective clipping detection approach, as well as a signal reconstruction approach based on deep neural networks (DNNs). Experiments are conducted on the core test of the NIST SRE2008 task, with clipped speech simulated at various clipping rates. The results show that clipping does affect speaker recognition performance, but the impact is marginal unless the clipping rate exceeds 80%. We also find that the simple distribution-based detection method detects clipped speech with high accuracy, and that DNN-based reconstruction yields promising performance gains for speaker recognition on clipped speech.
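To make "simulating clipped speech at various clipping rates" concrete, the following is a minimal sketch, not the paper's procedure. It assumes that the clipping rate means the fraction of samples that saturate under hard clipping, a definition the abstract does not state. The names `clip_to_rate` and `clipped_fraction` are hypothetical, and the peak pile-up check is only a crude stand-in for a distribution-based detector.

```python
import math
import random


def clip_to_rate(samples, rate):
    """Hard-clip `samples` so that roughly `rate` of them saturate.

    Assumption: "clipping rate" is the share of samples whose magnitude
    reaches the clipping threshold. The paper may define it differently.
    """
    if not samples:
        raise ValueError("samples must be non-empty")
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    magnitudes = sorted(abs(s) for s in samples)
    # Pick the magnitude below which a (1 - rate) share of samples falls.
    idx = min(len(magnitudes) - 1, math.floor((1.0 - rate) * len(magnitudes)))
    threshold = magnitudes[idx]
    clipped = [max(-threshold, min(threshold, s)) for s in samples]
    return clipped, threshold


def clipped_fraction(samples, tol=1e-3):
    """Estimate the share of samples piled up at the peak amplitude.

    Unclipped speech rarely sits at its peak, so a large share here hints
    at clipping. This is an illustrative heuristic only.
    """
    peak = max(abs(s) for s in samples)
    if peak == 0.0:
        return 0.0
    near_peak = sum(1 for s in samples if abs(s) >= (1.0 - tol) * peak)
    return near_peak / len(samples)


if __name__ == "__main__":
    rng = random.Random(0)
    # Gaussian noise stands in for one second of 16 kHz speech.
    clean = [rng.gauss(0.0, 1.0) for _ in range(16000)]
    for rate in (0.1, 0.5, 0.8):
        clipped, _ = clip_to_rate(clean, rate)
        print(rate, round(clipped_fraction(clipped), 3))
```

Choosing the threshold from the sorted magnitudes, rather than fixing an amplitude, keeps the saturated share comparable across utterances recorded at different levels.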
