Abstract

When processing real-world recordings of speech, it is highly likely that noise will be present at some point in the signal. The problem is compounded when the noise occurs in short, impulsive bursts at random intervals. Traditional signal processing methods for detecting speech rely on the spectral energy of the incoming signal to decide whether a segment contains speech. When noise is present, however, this simple energy detection is prone to falsely flagging noise as speech. This paper demonstrates an alternative way of processing a noisy speech signal that combines information-theoretic and signal processing principles to differentiate speech segments from noise. This preprocessing technique allows a speaker recognition system to train statistical speaker models on noise-corrupted speech files and to construct models statistically similar to those constructed from noise-free data. The preprocessing method is shown to outperform traditional spectrum-based methods for both low-entropy and high-entropy noise in low signal-to-noise ratio environments, reducing feature-space distortion as measured by the Cauchy–Schwarz (CS) distance metric.
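The failure mode of energy-only detection described above can be illustrated with a small synthetic sketch. It is not the method proposed in the paper. It only shows, under assumed parameters, that a frame-energy threshold flags an impulsive noise burst just as readily as a harmonic, speech-like segment, while a simple information-theoretic feature (spectral entropy) separates the two. The sampling rate, frame sizes, threshold factor, test signal, and all function names are illustrative assumptions.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames (one frame per row)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def frame_energy(frames):
    """Sum of squared samples in each frame."""
    return np.sum(frames ** 2, axis=1)

def spectral_entropy(frames):
    """Shannon entropy (bits) of each frame's normalized power spectrum."""
    window = np.hanning(frames.shape[1])
    power = np.abs(np.fft.rfft(frames * window, axis=1)) ** 2
    p = power / (power.sum(axis=1, keepdims=True) + 1e-12)
    return -np.sum(p * np.log2(p + 1e-12), axis=1)

# Assumed synthetic test signal: faint background noise, a harmonic
# "speech-like" segment, and a short burst of loud white noise.
fs = 8000
rng = np.random.default_rng(0)
t = np.arange(fs) / fs
x = 0.01 * rng.standard_normal(fs)
x[2000:4000] += (np.sin(2 * np.pi * 200 * t[2000:4000])
                 + 0.5 * np.sin(2 * np.pi * 400 * t[2000:4000]))
x[6000:6400] += rng.standard_normal(400)

frames = frame_signal(x, frame_len=256, hop=128)
energy = frame_energy(frames)
entropy = spectral_entropy(frames)

# Naive energy detector: flag frames well above the median frame energy.
# The factor 10 is an arbitrary illustrative threshold.
energy_flags = energy > 10 * np.median(energy)

harmonic = slice(16, 30)  # frames lying fully inside samples 2000..4000
burst = slice(47, 49)     # frames lying fully inside samples 6000..6400
print("energy flags, harmonic frames:", energy_flags[harmonic].all())
print("energy flags, burst frames:   ", energy_flags[burst].all())
print("mean entropy, harmonic frames:", entropy[harmonic].mean())
print("mean entropy, burst frames:   ", entropy[burst].mean())
```

In this sketch the energy detector flags both segments, whereas the harmonic frames show a much lower spectral entropy than the burst frames. Spectral entropy alone would not handle low-entropy noise, which the abstract says the proposed method also addresses, so the feature here is only a starting point for intuition.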
