An LVCSR Based Reading Miscue Detection System Using Knowledge of Reference and Error Patterns

Changliang Liu, Fuping Pan, Fengpei Ge, Bin Dong, Hongbin Suo, Yonghong Yan

doi:10.1587/transinf.e92.d.1716

Open Access

Journal: IEICE Transactions on Information and Systems
Publication Date: Jan 1, 2009
Citations: 2
License type: free

Abstract

This paper describes a reading miscue detection system based on the conventional Large Vocabulary Continuous Speech Recognition (LVCSR) framework [1]. To incorporate knowledge of the reference (what the reader ought to read) and of some error patterns into the decoding process, two methods are proposed: Dynamic Multiple Pronunciation Incorporation (DMPI) and Dynamic Interpolation of Language Model (DILM). DMPI dynamically adds pronunciation variants to the search space to predict reading substitutions and insertions. To resolve the conflict between the coverage of error predictions and the perplexity of the search space, only the pronunciation variants related to the reference are added. DILM dynamically interpolates the general language model based on an analysis of the reference, which keeps the active decoding paths relatively close to the reference. This makes recognition more accurate, which in turn improves detection performance. At the final stage of detection, an improved dynamic programming (DP) algorithm aligns the confusion network (CN) produced by speech recognition with the reference to generate the detection result. Experimental results show that the two proposed methods reduce the Equal Error Rate (EER) by 14% relative, from 46.4% to 39.8%.
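The final alignment stage can be illustrated with a minimal sketch. This is a plain edit-distance style DP, not the paper's improved variant, whose details the abstract does not give. The confusion network is assumed to be a list of slots mapping candidate words to posterior probabilities. The empty-string marker `EPS` for a "no word" arc, the cost definitions, and the `threshold` parameter are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

EPS = ""  # hypothetical marker for the "no word" arc of a slot


def align_reference(reference: List[str],
                    network: List[Dict[str, float]],
                    threshold: float = 0.5) -> List[Tuple[str, str]]:
    """Align reference words to confusion-network slots and label each word."""
    n, m = len(reference), len(network)
    # cost[i][j]: best cost of aligning the first i reference words
    # with the first j slots; back[i][j] records the move taken.
    cost = [[0.0] * (m + 1) for _ in range(n + 1)]
    back = [[""] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = float(i)          # reference word with no slot left
        back[i][0] = "del"
    for j in range(1, m + 1):
        # Skipping a slot is cheap when the slot is probably empty.
        cost[0][j] = cost[0][j - 1] + 1.0 - network[j - 1].get(EPS, 0.0)
        back[0][j] = "ins"
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            slot = network[j - 1]
            word = reference[i - 1]
            options = [
                # Pair the word with the slot: cheap if the slot supports it.
                (cost[i - 1][j - 1] + 1.0 - slot.get(word, 0.0), "sub"),
                # Reference word was not read at all.
                (cost[i - 1][j] + 1.0, "del"),
                # Slot holds extra speech (or nothing) beyond the reference.
                (cost[i][j - 1] + 1.0 - slot.get(EPS, 0.0), "ins"),
            ]
            cost[i][j], back[i][j] = min(options, key=lambda o: o[0])
    # Trace back from the end and label every reference word.
    labels: List[Tuple[str, str]] = []
    i, j = n, m
    while i > 0 or j > 0:
        move = back[i][j]
        if move == "sub":
            p = network[j - 1].get(reference[i - 1], 0.0)
            labels.append((reference[i - 1],
                           "correct" if p >= threshold else "miscue"))
            i -= 1
            j -= 1
        elif move == "del":
            labels.append((reference[i - 1], "miscue"))
            i -= 1
        else:
            j -= 1
    labels.reverse()
    return labels


# Toy example: the reader says "cap" for "cat" and adds a possible filler.
reference = ["the", "cat", "sat"]
network = [
    {"the": 0.9, "a": 0.1},
    {"cat": 0.2, "cap": 0.8},
    {EPS: 0.7, "uh": 0.3},
    {"sat": 0.95, EPS: 0.05},
]
print(align_reference(reference, network))
# [('the', 'correct'), ('cat', 'miscue'), ('sat', 'correct')]
```

Sweeping `threshold` trades false alarms against missed miscues, which is the kind of operating curve from which an EER is read off. The reported relative reduction follows from the two EER figures: (46.4 - 39.8) / 46.4 ≈ 14%.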