Abstract

In information appliances based on speech recognition, users' spoken queries are converted into text queries by automatic speech recognition (ASR) engines. If the top-1 result of an ASR engine is incorrect, the error propagates to the subsequent natural language processing steps. To alleviate this error propagation problem, we propose a post-processing model for revising ASR errors. Based on a sequence-to-sequence neural network, the proposed model generates a correct sentence from multiple candidate sentences returned by an ASR engine. Because it uses only syllables as input features, the proposed model requires no external resources or feature engineering effort. In our experiments with a Korean spoken chatting and FAQ corpus, the proposed model outperformed the previous models.
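
As a rough illustration of the syllable-only input described above, the sketch below turns a list of ASR candidate sentences into a single syllable-token sequence that a sequence-to-sequence encoder could consume. The abstract does not say how the candidates are combined, so the concatenation scheme, the `<sep>` and `<sp>` tokens, and the function names are all assumptions made for this example, not the authors' method. It relies only on the fact that each precomposed Hangul character represents one syllable.

```python
import unicodedata

SEP = "<sep>"   # hypothetical token separating ASR candidates
SPACE = "<sp>"  # hypothetical token marking a word boundary


def to_syllables(sentence):
    """Split a sentence into syllable tokens.

    NFC normalization composes Hangul jamo into single syllable blocks,
    so iterating over characters yields one token per syllable.
    """
    normalized = unicodedata.normalize("NFC", sentence.strip())
    return [SPACE if ch == " " else ch for ch in normalized]


def build_encoder_input(candidates):
    """Concatenate the syllables of all candidates, separated by SEP.

    This joining scheme is an assumption for illustration only.
    """
    tokens = []
    for i, candidate in enumerate(candidates):
        if i > 0:
            tokens.append(SEP)
        tokens.extend(to_syllables(candidate))
    return tokens


# Two made-up candidates for the same utterance; the second has a recognition error.
n_best = ["날씨 어때", "날시 어때"]
encoder_input = build_encoder_input(n_best)
```

In a full system the resulting tokens would be mapped to integer ids and passed to the encoder, with the decoder trained to emit the syllables of the reference transcript.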
