Abstract

Unconstrained handwritten text recognition is one of the most difficult problems in the field of pattern recognition. This paper presents a new on-line Arabic handwriting recognition system based on hidden Markov models (HMMs). In addition to the off-line features commonly used in HMM-based Arabic recognition systems, we use on-line features and a combination of the two. Delta and acceleration features are used as approximations to the first and second time derivatives of the observation vectors, and they proved very effective in improving the system’s performance. Delayed strokes are a well-known problem in on-line handwriting recognition because the order in which they are written varies among writers. We solve this problem by removing the delayed strokes, using a new detection approach that relies on baseline information and the shape of the strokes. The baseline detection method used in our system is based on horizontal projection. Removing delayed strokes also makes it possible to merge Arabic characters that share the same primary stroke, and are distinguishable only by their delayed strokes, into a single class (one HMM), which increased the recognition rate. We also developed a new lexicon reduction algorithm based on the detected delayed strokes, which improves the system’s speed and recognition rate. The on-line ADAB database of Tunisian town names is used for system training and evaluation. We achieved recognition rates of up to 97.5%, which is very promising relative to the highest recognition rate reported on this database and significantly higher than the rates achieved by other HMM-based Arabic handwriting recognition systems.
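
As an illustration of the delta and acceleration features mentioned above, the following is a minimal sketch that assumes the regression formula common in HMM toolkits, d_t = Σ_k k (x_{t+k} − x_{t−k}) / (2 Σ_k k²), with edge frames replicated. The abstract does not state the exact formula or window size used in the paper, so the function names `delta` and `add_dynamic_features` and the `window` parameter are hypothetical.

```python
from typing import List, Sequence

Frame = List[float]


def delta(frames: Sequence[Sequence[float]], window: int = 2) -> List[Frame]:
    """Approximate the time derivative of each feature by linear regression.

    Assumption: regression-style deltas with the first and last frames
    replicated at the sequence boundaries. The paper may use another scheme.
    """
    if window < 1:
        raise ValueError("window must be >= 1")
    n = len(frames)
    if n == 0:
        return []
    dim = len(frames[0])
    denom = 2.0 * sum(k * k for k in range(1, window + 1))
    out: List[Frame] = []
    for t in range(n):
        row = [0.0] * dim
        for k in range(1, window + 1):
            ahead = frames[min(t + k, n - 1)]   # clamp at the last frame
            behind = frames[max(t - k, 0)]      # clamp at the first frame
            for d in range(dim):
                row[d] += k * (ahead[d] - behind[d])
        out.append([v / denom for v in row])
    return out


def add_dynamic_features(frames: Sequence[Sequence[float]], window: int = 2) -> List[Frame]:
    """Append delta and acceleration (delta of delta) to every frame."""
    d1 = delta(frames, window)
    d2 = delta(d1, window)
    return [list(f) + a + b for f, a, b in zip(frames, d1, d2)]


# Hypothetical pen trajectory: one (x, y) observation per sample point.
trajectory = [[0.0, 0.0], [1.0, 0.5], [2.0, 1.5], [3.0, 3.0]]
observations = add_dynamic_features(trajectory, window=1)
```

Each observation vector grows to three times its original length (static, delta, acceleration), which lets the HMM state emissions capture local writing dynamics as well as static shape.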
