Abstract

In this paper, an error correction method is proposed to improve the performance of dysarthric automatic speech recognition (ASR). To this end, context-dependent pronunciation variations are modelled using a weighted Kullback-Leibler (KL) distance between the acoustic models of the ASR system. The context-dependent pronunciation variation model is then converted into a weighted finite-state transducer (WFST) and composed with a lexicon and a language model. ASR experiments show that the average word error rate (WER) of a WFST-based ASR system with the proposed error correction method is reduced by 19.73% relative, compared to an ASR system without error correction. Moreover, error correction using the weighted KL distance reduces the average WER by a further 3.81% relative, compared to error correction using an unweighted KL distance.
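To make the distance measure concrete, the following is a minimal sketch of a weighted KL distance between two diagonal-covariance Gaussian acoustic model states, the usual building block of HMM/GMM acoustic models. The closed-form KL divergence between diagonal Gaussians is standard; the symmetrisation and the context-dependent weight `context_weight` are assumptions made for illustration, since the paper's exact weighting scheme is not given in the abstract.

```python
import numpy as np

def kl_gaussian_diag(mu_p, var_p, mu_q, var_q):
    """Closed-form KL(p || q) between two diagonal-covariance Gaussians."""
    return 0.5 * np.sum(
        np.log(var_q / var_p)
        + (var_p + (mu_p - mu_q) ** 2) / var_q
        - 1.0
    )

def weighted_kl_distance(mu_p, var_p, mu_q, var_q, context_weight=1.0):
    """Hypothetical weighted, symmetrised KL distance between two acoustic
    model states; the weight stands in for the paper's context-dependent
    weighting, whose exact form is not specified in the abstract."""
    sym_kl = 0.5 * (
        kl_gaussian_diag(mu_p, var_p, mu_q, var_q)
        + kl_gaussian_diag(mu_q, var_q, mu_p, var_p)
    )
    return context_weight * sym_kl

# Toy usage: compare a canonical phone model with a dysarthric variant.
rng = np.random.default_rng(0)
dim = 13  # e.g. 13 MFCC dimensions
mu_canon, var_canon = rng.normal(size=dim), np.full(dim, 1.0)
mu_variant, var_variant = mu_canon + 0.3, np.full(dim, 1.5)
print(weighted_kl_distance(mu_canon, var_canon,
                           mu_variant, var_variant,
                           context_weight=0.8))
```

In a pipeline of the kind described, such distances would score candidate pronunciation variants, which are then encoded as arcs of a WFST and composed with the lexicon and language model transducers.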
