Abstract

Image signatures are popular in pattern recognition and classification because they give fast yet accurate results for most problems. This paper presents a fast multistage algorithm for recognizing printed isolated characters of any script, especially non-Latin scripts such as Arabic and Urdu. The proposed method improves on a previously proposed approach that takes size-normalized binary images of isolated characters, extracts a signature for each based on its number of black pixels, and uses that signature to recognize the character set. The same signature extraction approach is used here, with the addition of a signature scaling factor and some other modifications. Confusion matrices for the Urdu, Arabic and English character sets are built using standard template matching, the previous approach and the improved technique. On average, the matching between any pair of characters of the same set decreases by more than 10%, which increases the probability of correct recognition even in the presence of noisy patterns. Algorithms have also been developed to compare the two approaches computationally. Although the algorithm is initially implemented for three character sets (Urdu, Arabic and English), the implementation discussion shows that the proposed technique can be applied to any Latin or non-Latin script simply by providing its character set.
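As a rough illustration of the general idea of a black-pixel signature, the following minimal sketch may help. It is not the paper's algorithm: the abstract does not define the signature layout, the scaling factor, or the matching score, so the row-and-column pixel counts, the L1-based similarity, and the names `black_pixel_signature`, `signature_similarity` and `pairwise_matching` are all assumptions made for this example. The signature scaling factor is deliberately left out because the abstract does not specify it.

```python
from typing import Dict, List, Tuple

# Assumed representation: a size-normalized binary image as a list of rows,
# where 1 marks a black pixel and 0 a white pixel.
Image = List[List[int]]


def black_pixel_signature(image: Image) -> List[int]:
    """Hypothetical signature: black-pixel counts per row, then per column."""
    row_counts = [sum(row) for row in image]
    col_counts = [sum(col) for col in zip(*image)]
    return row_counts + col_counts


def signature_similarity(a: List[int], b: List[int]) -> float:
    """Assumed score in [0, 1]: 1.0 for identical signatures, lower otherwise."""
    if len(a) != len(b):
        raise ValueError("signatures must come from images of the same size")
    total = sum(a) + sum(b)
    if total == 0:
        return 1.0  # two blank images are treated as identical
    distance = sum(abs(x - y) for x, y in zip(a, b))
    return 1.0 - distance / total


def pairwise_matching(templates: Dict[str, Image]) -> Dict[Tuple[str, str], float]:
    """Similarity of every ordered pair of templates, in the spirit of a confusion matrix."""
    sigs = {name: black_pixel_signature(img) for name, img in templates.items()}
    return {(p, q): signature_similarity(sigs[p], sigs[q]) for p in sigs for q in sigs}


# Two toy 3x3 glyphs, invented for this example only.
glyph_i = [[0, 1, 0],
           [0, 1, 0],
           [0, 1, 0]]
glyph_l = [[1, 0, 0],
           [1, 0, 0],
           [1, 1, 1]]

scores = pairwise_matching({"I": glyph_i, "L": glyph_l})
```

Under these assumptions, a lower off-diagonal score such as `scores[("I", "L")]` means the two characters are less likely to be confused, which is the kind of reduction in pairwise matching the abstract reports.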
