Abstract
With mobile phone penetration high and growing rapidly, speech-based access to information is an attractive proposition. However, automatic speech recognition (ASR) performance is seriously compromised in real-world scenarios where background acoustic noise is omnipresent. Speech enhancement methods can help to improve the quality of the signal presented to the ASR system at the receiving end. These methods typically exploit spectral diversity to separate speech from noise. While this works for most background noise, it fails for noise arising from speech sources, such as interfering speakers in the vicinity of the caller. In this paper, we investigate the potential advantages of generating spatial cues via stereo microphones on the mobile phone handset to enhance speech. Such enhancement of foreground speech can be done using blind source separation (BSS). When applied to the stereo mixtures before transmission, BSS is shown to achieve a significant improvement in ASR accuracy in the context of a mobile-phone-based agricultural information access system.
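As a rough illustration of how two microphones provide the spatial diversity that BSS relies on, the sketch below separates a synthetic two-channel mixture with a small FastICA routine. The abstract does not say which BSS algorithm is used, so FastICA is an assumption here, as are the Laplacian stand-in sources, the mixing matrix `A`, and all function names. Real acoustic mixing is convolutive rather than instantaneous, so this is a simplified toy model and not the method evaluated in the paper.

```python
import numpy as np


def whiten(x):
    """Center a (channels, samples) array and decorrelate it to unit variance."""
    x = x - x.mean(axis=1, keepdims=True)
    eigval, eigvec = np.linalg.eigh(np.cov(x))
    return (eigvec / np.sqrt(eigval)) @ eigvec.T @ x


def sym_decorrelate(w):
    """Symmetric orthogonalization: W <- (W W^T)^(-1/2) W."""
    eigval, eigvec = np.linalg.eigh(w @ w.T)
    return (eigvec / np.sqrt(eigval)) @ eigvec.T @ w


def fastica(x, n_iter=200, tol=1e-8, seed=0):
    """Estimate independent sources from mixtures x of shape (channels, samples)."""
    z = whiten(x)
    n_ch, n_samp = z.shape
    rng = np.random.default_rng(seed)
    w = sym_decorrelate(rng.standard_normal((n_ch, n_ch)))
    for _ in range(n_iter):
        g = np.tanh(w @ z)            # nonlinearity g(w_i^T z)
        g_prime = 1.0 - g ** 2        # its derivative
        # Fixed-point update: E[g(w_i^T z) z] - E[g'(w_i^T z)] w_i, for every row w_i
        w_new = (g @ z.T) / n_samp - np.diag(g_prime.mean(axis=1)) @ w
        w_new = sym_decorrelate(w_new)
        # Converged when each new row is parallel to the corresponding old row
        delta = np.max(np.abs(np.abs(np.diag(w_new @ w.T)) - 1.0))
        w = w_new
        if delta < tol:
            break
    return w @ z


# Synthetic stand-ins for a foreground talker and an interfering talker
# (Laplacian samples are super-Gaussian, loosely like speech amplitudes).
rng = np.random.default_rng(42)
s = rng.laplace(size=(2, 20000))

# Hypothetical instantaneous mixing: each microphone hears both talkers
# with different gains, which is the spatial cue being exploited.
A = np.array([[1.0, 0.6],
              [0.5, 1.0]])
x = A @ s

est = fastica(x)

# Absolute correlation between each estimate (rows) and each true source (columns).
# ICA recovers sources only up to permutation, sign, and scale.
corr = np.abs(np.corrcoef(est, s)[:2, 2:])
print(np.round(corr, 3))
```

Each row of `corr` should have one entry close to 1, showing that each output channel matches one of the original sources. In a pipeline like the one described, the separated foreground channel, not the raw mixture, would be what is sent on to the recognizer.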