Abstract

Detection of automatic speech recognition (ASR) errors is crucial to preventing their further propagation through statistical machine translation (SMT) in conversational spoken language translation (CSLT) systems. In this paper, we venture beyond traditional features obtained from the ASR decoder and the hypothesized word sequence, and explore additional information streams provided by an error-robust CSLT system, including SMT confidence estimates and posteriors from named entity detection (NED). Another significant novelty of this work is the use of an automated word boundary detector, based on acoustic-prosodic features, to verify the existence of ASR-hypothesized word boundaries, which further improves ASR error detection. Offline evaluation on a test set designed to induce ASR errors showed that, at a 10% false alarm rate, the proposed features provide a 2.8% absolute (4.2% relative) improvement in detection rate over a state-of-the-art baseline error detector that uses a rich set of features traditionally employed in the existing literature.
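
The evaluation metric quoted above, detection rate at a fixed false alarm rate, can be illustrated with a minimal Python sketch. It is not the paper's evaluation code: the function names, the per-word score convention (higher means more likely an ASR error), and the thresholding rule are assumptions made for illustration. The sketch also assumes the usual definition of relative improvement as the absolute gain divided by the baseline detection rate.

```python
import math


def detection_rate_at_far(scores, labels, target_far=0.10):
    """Fraction of true ASR errors flagged while keeping the false alarm
    rate at or below target_far.

    scores: per-word error scores (assumed: higher = more likely an error).
    labels: 1 for a misrecognized word, 0 for a correctly recognized word.
    """
    negatives = sorted((s for s, y in zip(scores, labels) if y == 0), reverse=True)
    positives = [s for s, y in zip(scores, labels) if y == 1]
    if not negatives or not positives:
        raise ValueError("need at least one correct word and one error")

    # Maximum number of correct words we may flag by mistake.
    # The small epsilon guards against floating-point round-down.
    allowed = math.floor(target_far * len(negatives) + 1e-9)

    # Flag only scores strictly above the (allowed + 1)-th highest
    # negative score, so at most `allowed` negatives are flagged.
    threshold = negatives[allowed] if allowed < len(negatives) else float("-inf")

    detected = sum(1 for s in positives if s > threshold)
    return detected / len(positives)


def relative_improvement(baseline_rate, improved_rate):
    """Absolute gain expressed as a fraction of the baseline rate."""
    return (improved_rate - baseline_rate) / baseline_rate
```

Comparing two detectors then amounts to calling `detection_rate_at_far` on each detector's scores over the same labeled test set and reporting both the difference (absolute) and `relative_improvement` of the two rates.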
