Abstract

In this paper, an error correction method is proposed to improve the performance of dysarthric automatic speech recognition (ASR). To this end, context-dependent pronunciation variations are modelled using a weighted Kullback-Leibler (KL) distance between the acoustic models of the ASR system. The context-dependent pronunciation variation model is then converted into a weighted finite-state transducer (WFST) and combined with a lexicon and a language model. ASR experiments show that the average word error rate (WER) of a WFST-based ASR system with the proposed error correction method is reduced by a relative 19.73% compared to an ASR system without error correction. Moreover, the error correction method using the weighted KL distance reduces the average WER by a relative 3.81% compared to the same method using an unweighted KL distance.
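
The abstract does not specify the exact weighting scheme or acoustic-model topology, so the following is only a minimal Python sketch of how a weighted, symmetrised KL distance between two acoustic models might be computed. It assumes single-Gaussian, diagonal-covariance states and uses hypothetical state-level weights (e.g. occupancy counts); the function names and the weighting choice are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def kl_diag_gaussian(mu_p, var_p, mu_q, var_q):
    """KL divergence KL(p || q) between two diagonal-covariance Gaussians."""
    mu_p, var_p = np.asarray(mu_p, float), np.asarray(var_p, float)
    mu_q, var_q = np.asarray(mu_q, float), np.asarray(var_q, float)
    return 0.5 * np.sum(
        np.log(var_q / var_p) + (var_p + (mu_p - mu_q) ** 2) / var_q - 1.0
    )

def weighted_kl_distance(states_a, states_b, weights):
    """Weighted, symmetrised KL distance between two acoustic models,
    each given as a list of per-state (mean, variance) pairs.

    `weights` (here assumed to be state-occupancy counts) scales each
    state's contribution; this weighting is an assumption for illustration.
    """
    total = 0.0
    for w, (mu_a, var_a), (mu_b, var_b) in zip(weights, states_a, states_b):
        # Symmetrise the KL divergence so the distance is order-independent.
        d = 0.5 * (kl_diag_gaussian(mu_a, var_a, mu_b, var_b)
                   + kl_diag_gaussian(mu_b, var_b, mu_a, var_a))
        total += w * d
    return total / max(sum(weights), 1e-12)
```

In a complete system of the kind the abstract describes, distances like these would typically be mapped to substitution-arc weights in the pronunciation-variation WFST, which is then composed with the lexicon and language-model transducers.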
