Abstract

Regional and cultural diversity around the world, and especially in Pakistan, has given rise to a large number of writing systems and scripts with varying character sets. Developing an optimal OCR system for such a large and varied character set is a challenging task. The practically unlimited variation in handwritten text, caused by mood swings, differing writing styles, changes in writing medium, and similar factors, continues to challenge the research community. Slight differences in character shapes across scripts are a major barrier to developing character recognition (CR) systems for cursive scripts. The lack of benchmark results and corpora for cursive-script CR further impedes researchers in developing optimal CR systems. To address these issues, this work aims to develop an optimal OCR system for recognizing handwritten Pashto characters. The lack of a standard corpus of handwritten Pashto characters is addressed by developing a medium-sized corpus of 14,784 handwritten samples. A K-nearest-neighbor classifier is used to recognize the Pashto characters from features extracted with the zoning technique. Tested across varying training and test sets, the proposed OCR system achieves an overall accuracy of 85.31%.
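
The zoning and K-nearest-neighbor combination mentioned above can be illustrated with a minimal sketch. The abstract does not state the zone grid size, the per-zone feature, the distance metric, or the value of K, so everything below is an assumption chosen for illustration: a binary image is split into a grid, the fraction of foreground pixels in each zone is used as the feature, and Euclidean distance with majority voting selects the label. The function names `zoning_features` and `knn_predict` and the toy "vertical" and "horizontal" stroke images are hypothetical and are not taken from the paper or its corpus.

```python
import math
from collections import Counter


def zoning_features(image, zone_rows=4, zone_cols=4):
    """Return the foreground-pixel density of each zone, row by row.

    `image` is a list of equal-length rows holding 0 (background) or 1 (ink).
    The grid size is an assumed parameter, not a value from the paper.
    """
    height, width = len(image), len(image[0])
    features = []
    for zr in range(zone_rows):
        r0 = zr * height // zone_rows
        r1 = (zr + 1) * height // zone_rows
        for zc in range(zone_cols):
            c0 = zc * width // zone_cols
            c1 = (zc + 1) * width // zone_cols
            area = (r1 - r0) * (c1 - c0)
            ink = sum(image[r][c] for r in range(r0, r1) for c in range(c0, c1))
            features.append(ink / area if area else 0.0)
    return features


def knn_predict(train_features, train_labels, query, k=3):
    """Label `query` by majority vote among its k nearest training vectors."""
    neighbors = sorted(
        zip(train_features, train_labels),
        key=lambda pair: math.dist(pair[0], query),  # Euclidean distance (assumed)
    )
    votes = Counter(label for _, label in neighbors[:k])
    return votes.most_common(1)[0][0]


# Toy 4x4 binary images standing in for character samples (hypothetical data).
vertical = [[0, 1, 0, 0] for _ in range(4)]
horizontal = [[0, 0, 0, 0], [1, 1, 1, 1], [0, 0, 0, 0], [0, 0, 0, 0]]
query_image = [[0, 1, 0, 0], [0, 1, 0, 0], [0, 1, 0, 0], [0, 0, 0, 0]]

train_x = [zoning_features(vertical, 2, 2), zoning_features(horizontal, 2, 2)]
train_y = ["vertical", "horizontal"]
predicted = knn_predict(train_x, train_y, zoning_features(query_image, 2, 2), k=1)
```

In this toy setup the shortened vertical stroke lies closer to the "vertical" training sample in zone-density space, so `predicted` is `"vertical"`. A real system would work on normalized character images and tune the grid size and K on held-out data.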
