Abstract

Vietnamese spelling error detection and correction is a crucial task in natural language processing and plays an important role in many real-world applications. Although it has been studied extensively, handling the diverse types of errors in Vietnamese remains a challenge. In this paper, we propose a model that detects and corrects specific Vietnamese spelling errors by combining a pre-trained neural network-based Vietnamese language model with an N-gram language model. We also provide a clear definition of the error types the model can handle and of the error generation rules used to build the training set, and we evaluate the proposed model on a Vietnamese benchmark dataset at the word level. The experimental results show that, in detection, our model achieves an F1-score 1% to 14% higher than other neural network-based pre-trained language models. For correction, we compare bigram, trigram, and 4-gram language models to choose the best one.

Keywords: Vietnamese spell correction, Error detection, Deep learning, Language model
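
The correction step described above, in which an N-gram language model chooses among replacement candidates, can be illustrated with a minimal sketch. This is not the authors' implementation: the toy corpus, the add-one smoothing, the candidate list, and the function names `train_bigram`, `log_prob`, and `correct` are all assumptions made for illustration. The position of the misspelled token is taken as given, standing in for the output of the neural detection model.

```python
import math
from collections import Counter

def train_bigram(sentences):
    """Count unigrams and bigrams over tokenized sentences with boundary markers."""
    unigrams, bigrams = Counter(), Counter()
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        unigrams.update(padded)
        bigrams.update(zip(padded, padded[1:]))
    return unigrams, bigrams

def log_prob(tokens, unigrams, bigrams):
    """Add-one smoothed bigram log-probability of a token sequence (assumed smoothing)."""
    vocab_size = len(unigrams)
    padded = ["<s>"] + tokens + ["</s>"]
    return sum(
        math.log((bigrams[(a, b)] + 1) / (unigrams[a] + vocab_size))
        for a, b in zip(padded, padded[1:])
    )

def correct(tokens, position, candidates, unigrams, bigrams):
    """Return the candidate that maximizes the sentence score at the flagged position."""
    def score(candidate):
        replaced = tokens[:position] + [candidate] + tokens[position + 1:]
        return log_prob(replaced, unigrams, bigrams)
    return max(candidates, key=score)

# Toy corpus, purely illustrative.
corpus = [
    ["tôi", "đi", "học"],
    ["tôi", "đi", "làm"],
    ["em", "đi", "học"],
]
unigrams, bigrams = train_bigram(corpus)

# "hoc" at position 2 is assumed to have been flagged by an upstream detector.
best = correct(["tôi", "đi", "hoc"], 2, ["học", "hộc", "hóc"], unigrams, bigrams)
```

A trigram or 4-gram variant would differ only in the length of the context window used for counting and scoring, which is the dimension the comparison in the abstract explores.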
