Abstract

While automatic speech recognition (ASR) can work very well for clean speech, recognition accuracy often degrades significantly when the speech signal is subject to corruption, as occurs in many communication channels. This paper will survey recent methods for handling various distortions in practical ASR. The problem is often presented as an issue of mismatch between the models that are created during prior training phases and unforeseen environmental acoustic conditions that occur during the normal test phase. As one can never anticipate all possible future conditions, ASR analysis must be able to adapt to a wide variety of distortions. Human listeners furnish a useful standard of comparison for ASR in that humans are much more flexible in handling unexpected acoustic distortions than current ASR is. Methods that adapt ASR features and models will be compared against ASR methods that enhance the noisy input speech. Other topics to be discussed will include estimation of noise and channel parameters, RAS...
