Automatic Stochastic Arabic Spelling Correction With Emphasis on Space Insertions and Deletions

Mohamed I Alkanhal, Mohamed A Al-Badrashiny, Mansour M Alghamdi, Abdulaziz O Al-Qabbany

doi:10.1109/tasl.2012.2197612

Abstract

This paper presents a stochastic approach to correcting misspellings in Arabic text. A context-based two-layer system automatically corrects misspelled words in large datasets. The first layer produces, for each misspelled word, a list of possible alternatives ranked by Damerau-Levenshtein edit distance. The same layer also handles merged and split words that result from deleting or inserting the space character. The correct alternative for each misspelled word is then selected stochastically, based on the maximum marginal probability, via A* lattice search and m-gram probability estimation. A large dataset was used to build and test the system. The test results show that performance improves as the training set grows, reaching an F1 score of 97.9% for detection and 92.3% for correction.
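The candidate-generation step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy English lexicon, the distance threshold, and the fixed cost of 1 for a space edit are all assumptions made for illustration, and the distance function is the restricted (optimal string alignment) variant of Damerau-Levenshtein. The stochastic selection step (A* lattice search over m-gram probabilities) is not shown.

```python
def damerau_levenshtein(a: str, b: str) -> int:
    """Restricted Damerau-Levenshtein (optimal string alignment) distance.

    Counts insertions, deletions, substitutions, and transpositions of
    two adjacent characters, each at cost 1.
    """
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all i characters of a
    for j in range(cols):
        d[0][j] = j  # insert all j characters of b
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Adjacent transposition, e.g. "eh" -> "he".
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def candidates(word, lexicon, max_distance=2):
    """Return (distance, alternative) pairs for one token, best first.

    Includes lexicon entries within max_distance, plus two-word splits
    that undo a deleted space (assumed cost: 1).
    """
    ranked = [
        (dist, entry)
        for entry in lexicon
        if (dist := damerau_levenshtein(word, entry)) <= max_distance
    ]
    for k in range(1, len(word)):
        left, right = word[:k], word[k:]
        if left in lexicon and right in lexicon:
            ranked.append((1, left + " " + right))
    return sorted(ranked)


def merge_candidates(tokens, lexicon):
    """Return (index, merged_word) pairs that undo an inserted space."""
    return [
        (i, tokens[i] + tokens[i + 1])
        for i in range(len(tokens) - 1)
        if tokens[i] + tokens[i + 1] in lexicon
    ]


# Toy lexicon (hypothetical); a real system would use a large Arabic word list.
LEXICON = {"the", "cat", "sat", "on", "mat", "them", "at"}

print(candidates("teh", LEXICON))     # [(1, 'the'), (2, 'them')]
print(candidates("thecat", LEXICON))  # [(1, 'the cat')]
print(merge_candidates(["the", "m", "at"], LEXICON))  # [(0, 'them'), (1, 'mat')]
```

In a full system these ranked lists would only supply the alternatives at each word position; choosing among them would depend on context, which this sketch deliberately leaves out.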

Journal: IEEE Transactions on Audio, Speech, and Language Processing
Publication Date: Sep 1, 2012
Citations: 82

Similar Papers

Automatic Spelling Detection and Correction in the Medical Domain: A Systematic Literature Review
Jésica López-Hernández ... Ángela Almela
01 Jan 2019

Improved chemical text mining of patents using infinite dictionaries, translation and automatic spelling correction
Roger A Sayle ... Plamen Petrov
Journal of Cheminformatics | VOL. 3
19 Apr 2011

Automatic Spelling Correction for Resource-Scarce Languages using Deep Learning
Pravallika Etoori ... Radhika Mamidi
01 Jan 2018

Automatic Chinese Spelling Checking and Correction Based on Character-Based Pre-trained Contextual Representations
Haihua Xie ... Zhiyou Chen
01 Jan 2019
