Abstract

With an increasing commercial demand for speech interfaces to be integrated into language technology, many technologists have made an unfortunate discovery: combining existing automatic speech recognition (ASR) and natural language processing (NLP) systems often leads to disappointing results. This talk will discuss two factors that contribute to this disparity and make some general suggestions for language technologists and researchers looking to work with them. The first is the greater degree of variation in speech than in text (at least in languages like English), which can lead to higher error rates overall. The second is a mismatch in domain. Modern machine learning approaches to language technology are very sensitive to differences between datasets, and (due in part to the disciplinary division between researchers working on language technology for speech and text) most NLP applications have not been trained on speech data.
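As a purely illustrative sketch (not drawn from the talk itself), the toy heuristic below mimics an NLP component that depends on the casing and punctuation conventions of written text, and shows how those cues vanish in ASR-style output. The function name, the example sentences, and the simulated recognition error are all invented for this illustration.

```python
# Hypothetical example: a naive, capitalization-and-punctuation-based entity
# spotter (standing in for an NLP model trained on written text), applied to
# written input and to ASR-style output (lowercased, unpunctuated, with a
# recognition error). All names and data are invented for illustration.

import re

def naive_entity_spotter(text: str) -> list[str]:
    """Flag capitalized, non-sentence-initial tokens as candidate entities."""
    entities = []
    for sentence in re.split(r"[.!?]\s+", text):
        tokens = sentence.split()
        # Skip the first token: sentence-initial capitalization is uninformative.
        for token in tokens[1:]:
            if token[:1].isupper():
                entities.append(token.strip(",.?!"))
    return entities

written = "We met Maria Schmidt in Berlin yesterday. She works at Siemens."
asr_like = "we met maria schmitt in berlin yesterday she works at siemens"

print(naive_entity_spotter(written))   # ['Maria', 'Schmidt', 'Berlin', 'Siemens']
print(naive_entity_spotter(asr_like))  # [] -- the cues the component relied on are gone
```

The point of the sketch is the domain mismatch, not the specific heuristic: a system tuned to the conventions of written text can degrade sharply on transcribed speech even before ASR errors are taken into account.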
