Abstract

This study proposes a novel technique that recovers the temporal structure of the speech power spectrum. The histogram of the average speech log power spectrum shows that noise contamination shifts the noise peak, which in turn degrades the performance of speech recognition systems. A two-step scheme is proposed to weaken the effects of noise by first reducing the noise variance and then shifting the noise mean. The algorithm consists of two parts, two-dimensional smoothing and controlled noise subtraction, hence the name SNS. It resolves the discontinuity in the speech probability distribution function caused by traditional spectral-subtraction-type algorithms. In contrast to clean speech estimation methods, it does not need a prior statistical model of speech or noise, which makes it simple but effective. The effectiveness of the proposed filter is tested on the AURORA2 database. Very promising results are obtained: 88.59% for noisy speech (averaged over signal-to-noise ratios of 0-20 dB). The algorithm is compared against eight state-of-the-art speech recognition algorithms and, overall, produces significant improvements over them.
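The two-step scheme can be illustrated with a minimal sketch. The abstract does not give the smoothing kernel, the noise estimator, or the subtraction rule, so everything specific here is an assumption made for illustration: a box moving average over time and frequency, a noise estimate taken from the first few frames (assumed speech-free), and a subtraction capped at a fraction `alpha` of the current power so that no hard floor is needed. The names `smooth_2d`, `controlled_subtraction`, `sns`, `radius`, `noise_frames`, and `alpha` are hypothetical and do not come from the paper.

```python
import math


def smooth_2d(log_spec, radius=1):
    """Box moving average over time and frequency (assumed kernel).

    log_spec is a list of frames, each a list of log power values per bin.
    Averaging neighbouring cells reduces the variance of the noise floor.
    """
    n_frames, n_bins = len(log_spec), len(log_spec[0])
    out = []
    for t in range(n_frames):
        row = []
        for f in range(n_bins):
            total, count = 0.0, 0
            for tt in range(max(0, t - radius), min(n_frames, t + radius + 1)):
                for ff in range(max(0, f - radius), min(n_bins, f + radius + 1)):
                    total += log_spec[tt][ff]
                    count += 1
            row.append(total / count)
        out.append(row)
    return out


def controlled_subtraction(log_spec, noise_frames=2, alpha=0.9):
    """Lower the noise mean without flooring (illustrative rule, not the paper's).

    The subtracted amount is alpha * min(power, noise), so the result stays
    strictly positive for 0 <= alpha < 1 and varies continuously with power.
    """
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must be in [0, 1)")
    n_bins = len(log_spec[0])
    # Per-bin noise power, averaged over leading frames assumed to be speech-free.
    noise = [
        sum(math.exp(log_spec[t][f]) for t in range(noise_frames)) / noise_frames
        for f in range(n_bins)
    ]
    out = []
    for frame in log_spec:
        row = []
        for f, value in enumerate(frame):
            power = math.exp(value)
            cleaned = power - alpha * min(power, noise[f])
            row.append(math.log(cleaned))
        out.append(row)
    return out


def sns(log_spec, radius=1, noise_frames=2, alpha=0.9):
    """Smoothing first (variance reduction), then subtraction (mean shift)."""
    smoothed = smooth_2d(log_spec, radius)
    return controlled_subtraction(smoothed, noise_frames, alpha)
```

In this sketch, low-energy frames are pulled down by up to a factor of `1 - alpha` in power, while frames well above the noise estimate are barely changed. Because the rule never clips to a fixed floor, the output distribution has no pile-up at a floor value, which is the kind of discontinuity the abstract attributes to traditional spectral subtraction.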
