Abstract

Spelling correction is a widely studied problem in natural language processing. Because spelling errors keep an author's intended meaning from reaching the audience intact, writers and researchers spend considerable time reviewing their texts to detect and correct them, and an automatic tool would be a great help. Persian is particularly prone to spelling errors because of its unique features, and the presence of Arabic sentences and terms in Persian text makes correction harder still. There is therefore a need for a tool that can detect and correct spelling errors in bilingual Persian and Arabic content. This work proposes a supervised deep learning approach that uses a conditional random field (CRF) recurrent neural network to correct bilingual Arabic and Persian spelling errors. To build a data set for training and testing the model, 220,000 sentences with Arabic and Persian content were collected, and spelling errors were then generated artificially to produce pairs of correct and erroneous text. In the next step, a neural network based on the conditional random field takes features extracted from the data set as input and makes predictions. The design of these features is one of the important points in this type of implementation. The evaluation results show that the proposed approach achieves acceptable accuracy as the first bilingual Arabic and Persian spell corrector.
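The artificial error generation step can be illustrated with a minimal sketch. The abstract does not specify which error operations were used, so the four character-level edits below (delete, insert, substitute, transpose), the `ALPHABET` string, the `error_rate` parameter, and the function names `corrupt_word` and `make_pair` are all assumptions made for illustration, not the authors' method.

```python
import random

# Hypothetical alphabet for illustration only: a subset of letters used in
# Persian and Arabic text. The paper's actual character set is not given.
ALPHABET = "ابتثجحخدذرزسشصضطظعغفقکلمنوهی"


def corrupt_word(word, rng):
    """Apply one random character-level edit (an assumed error model)."""
    if len(word) < 2:
        return word  # too short to edit safely
    op = rng.choice(["delete", "insert", "substitute", "transpose"])
    i = rng.randrange(len(word) - 1)
    if op == "delete":
        return word[:i] + word[i + 1:]
    if op == "insert":
        return word[:i] + rng.choice(ALPHABET) + word[i:]
    if op == "substitute":
        # May occasionally pick the same letter and leave the word unchanged.
        return word[:i] + rng.choice(ALPHABET) + word[i + 1:]
    # transpose: swap the characters at positions i and i + 1
    return word[:i] + word[i + 1] + word[i] + word[i + 2:]


def make_pair(sentence, error_rate=0.2, seed=0):
    """Return an (erroneous, correct) sentence pair for supervised training."""
    rng = random.Random(seed)
    tokens = sentence.split()
    noisy = [
        corrupt_word(t, rng) if rng.random() < error_rate else t
        for t in tokens
    ]
    return " ".join(noisy), sentence
```

Calling `make_pair` on each clean sentence yields a parallel corpus of erroneous and correct text. Because every edit keeps the number of tokens fixed, the pairs stay aligned word by word, which suits a sequence labelling model such as a CRF.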
