Abstract

With increasing commercial demand for speech interfaces in language technology, many technologists have made an unfortunate discovery: combining existing automatic speech recognition (ASR) and natural language processing (NLP) systems often leads to disappointing results. This talk will discuss two factors that contribute to this disparity and offer some general suggestions for language technologists and researchers looking to work with these systems. The first is that speech varies more than text (at least in languages like English), which can lead to higher error rates overall. The second is a mismatch in domain: modern machine learning approaches to language technology are very sensitive to differences between datasets, and (due in part to the disciplinary division between researchers working on language technology for speech and for text) most NLP applications have not been trained on speech data.

