Abstract

This paper proposes a technique for recognizing unconstrained printed Arabic text. The technique uses features that measure image characteristics at local scales. A text-line image is divided into a set of one-pixel-wide windows that slide across the line. Run-length encoding is used to extract features from each window. A unique method is used to select the best number of transitions for each window. The proposed recognition system is trained and tested on the APTI (Arabic Printed Text Image) database. To select the optimal parameters for feature extraction and for the hidden Markov model (HMM) classifier, the APTI training set is further divided into a smaller training subset and a verification set. The estimated parameters are then used in the testing phase. The presented technique achieves state-of-the-art recognition results on the APTI database using HMMs, with an average recognition rate of 96.65% at the letter level.
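
The window-based feature extraction described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a binarized line image stored as a list of rows, treats each pixel column as one window, and uses a hypothetical `max_runs` parameter as a stand-in for the per-window number of transitions, whose actual selection method is not detailed in this abstract.

```python
from itertools import groupby


def column_run_lengths(column):
    """Run-length encode one pixel column as (pixel_value, run_length) pairs."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(column)]


def column_features(column, max_runs):
    """Build a fixed-length vector from the first max_runs run lengths.

    max_runs is a hypothetical parameter for this sketch; shorter
    encodings are zero-padded so every window yields the same length.
    """
    lengths = [length for _, length in column_run_lengths(column)][:max_runs]
    return lengths + [0] * (max_runs - len(lengths))


def line_features(image, max_runs=4):
    """Treat every pixel column of a binary line image as one window."""
    columns = zip(*image)  # transpose the rows into columns
    return [column_features(list(col), max_runs) for col in columns]


# Toy 4x3 binary image (1 = ink, 0 = background), for illustration only.
image = [
    [0, 1, 0],
    [0, 1, 1],
    [1, 0, 1],
    [1, 0, 0],
]
features = line_features(image, max_runs=3)
```

Each resulting vector could serve as one observation in the sequence passed to an HMM. Note that this simplified encoding keeps only the run lengths and discards whether a column starts with ink or background; a fuller version would likely retain that information.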
