Abstract

Cursive script segmentation is an essential problem in handwritten character recognition. This is especially true for Arabic text, where cursive is the only writing mode, even in typewritten fonts. In this paper, we present a generalized segmentation approach for handwritten Arabic cursive scripts. The proposed approach is based on the analysis of the upper and lower contours of the word. The algorithm searches for local minima along the upper contour and local maxima along the lower contour of the word. These points are then marked as potential letter boundaries (PLBs). A set of rules, based on the nature of Arabic cursive scripts, is then applied to both upper and lower PLBs to eliminate improper ones. A matching process between upper and lower PLBs is then performed to obtain the minimum number of non-overlapping PLBs for each word. The output of the proposed segmentation algorithm is a set of labeled primitives that represent the Arabic word. To reconstruct the original word from its corresponding primitives and diacritics, a novel binding and dot assignment algorithm is introduced. The algorithm achieved a correct segmentation rate of 97.7% when tested on samples of loosely constrained handwritten cursive script words consisting of 7922 characters written by 14 different writers.
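
The contour search described above can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not say how the contours are extracted or how flat stretches are handled, so the binary-image representation, the plateau-centre convention, the exclusion of the word's edge columns, and all function names here (`contour_heights`, `plateau_extrema`, `candidate_boundaries`) are assumptions made for illustration. The pruning rules and the upper/lower PLB matching step are not shown because the abstract gives no details about them.

```python
from typing import List, Tuple

# Assumed input: a binary image as rows of 0 (background) / 1 (ink),
# cropped so that every column contains at least one ink pixel.
Image = List[List[int]]


def contour_heights(img: Image) -> Tuple[List[int], List[int]]:
    """Return (upper, lower) contour heights per column, measured up from the bottom row."""
    rows, cols = len(img), len(img[0])
    upper, lower = [], []
    for x in range(cols):
        ink_rows = [y for y in range(rows) if img[y][x]]
        if not ink_rows:
            raise ValueError(f"column {x} has no ink; crop to one connected sub-word")
        upper.append(rows - 1 - min(ink_rows))  # height of the topmost ink pixel
        lower.append(rows - 1 - max(ink_rows))  # height of the bottommost ink pixel
    return upper, lower


def plateau_extrema(values: List[int], want_min: bool) -> List[int]:
    """Return the centre column of every interior local-minimum (or local-maximum) plateau."""
    sign = 1 if want_min else -1  # negating the values turns a maximum search into a minimum search
    found = []
    n = len(values)
    start = 0
    while start < n:
        # Extend the run of equal values starting at `start`.
        end = start
        while end + 1 < n and values[end + 1] == values[start]:
            end += 1
        # Only runs with a neighbour on both sides can be interior extrema.
        if start > 0 and end < n - 1:
            v = sign * values[start]
            if sign * values[start - 1] > v and sign * values[end + 1] > v:
                found.append((start + end) // 2)
        start = end + 1
    return found


def candidate_boundaries(img: Image) -> Tuple[List[int], List[int]]:
    """Columns of candidate PLBs: upper-contour minima and lower-contour maxima."""
    upper, lower = contour_heights(img)
    return plateau_extrema(upper, want_min=True), plateau_extrema(lower, want_min=False)
```

Heights are measured from the bottom row so that "minimum" and "maximum" keep their visual meaning: a valley in the upper contour and a peak in the lower contour both indicate a thin stretch of stroke, which is where connecting strokes between letters are likely to lie. Treating a flat stretch as a single candidate at its centre is one possible convention and may differ from the authors' choice.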
