Abstract

This paper examines spectral masking techniques as a preprocessing step in speech recognition systems and discusses the limits of these techniques at different signal-to-noise ratios (SNRs). Speech recognition systems generally perform poorly in noisy environments or in the presence of other speech signals. This work presents a blind source separation system based on ideal binary masks to deal with real situations in which speech signals are corrupted by noise, including other speech signals. The main contribution is an analysis of the performance limits of recognition systems that use spectral masking. When the speech signals were corrupted by other voice signals at SNRs of approximately 1, 10 and 20 dB, we obtained an increase of 18 % in the speech hit rate and an average gain of 10 dB in noise attenuation.
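To illustrate the ideal binary mask idea mentioned above, here is a minimal sketch in Python. It is not the authors' implementation. It assumes the target and interfering signals are already available as power spectrograms (for example, squared STFT magnitudes), which is what makes the mask "ideal". The function names, the `threshold_db` parameter and its 0 dB default are assumptions chosen for illustration.

```python
import numpy as np


def ideal_binary_mask(target_power, interferer_power, threshold_db=0.0):
    """Return 1.0 in time-frequency cells where the local SNR exceeds threshold_db, else 0.0.

    Both inputs are power spectrograms of identical shape (frequency x time).
    The 0 dB default threshold is an illustrative assumption.
    """
    eps = 1e-12  # avoids division by zero and log of zero in silent cells
    local_snr_db = 10.0 * np.log10((target_power + eps) / (interferer_power + eps))
    return (local_snr_db > threshold_db).astype(float)


def apply_mask(mixture_spec, mask):
    """Keep the mixture cells dominated by the target and zero out the rest."""
    return mixture_spec * mask


# Toy 2 x 2 spectrograms (hypothetical values, not data from the paper).
target = np.array([[4.0, 1.0], [0.0, 9.0]])
interferer = np.array([[1.0, 4.0], [1.0, 1.0]])
mask = ideal_binary_mask(target, interferer)
masked_mixture = apply_mask(target + interferer, mask)
```

In this toy case the mask keeps only the cells where the target is stronger than the interferer. A recognizer would then receive the masked spectrogram, or a signal resynthesized from it, instead of the raw mixture.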

