Abstract

This paper presents a novel geometrical, scale- and rotation-independent feature extraction (FE) technique for multi-lingual character recognition (CR). The performance of any CR technique depends mainly on the robustness of its FE method. Few scale- and rotation-independent FE techniques in the literature extract robust features from characters affected by noise such as distortion and breaks, and many FE methods fail to distinguish characters that look alike. We therefore propose a novel scale- and rotation-independent geometrical shape FE technique that successfully recognizes distorted, broken, and similar-looking characters. In addition to the proposed features, we use crossing count (CC) features, and we combine the two to form the feature vector (FV) of the character to be recognized. The proposed CR technique is evaluated on the publicly available media-lab license plate (LP), ISI_Bengali, and Chars74K benchmark data sets and achieves encouraging results. To assess the proposed FE method further, we use a proprietary data set containing nearly 168,000 multi-lingual characters from English, Devanagari, and Marathi scripts, again with encouraging results. On the publicly available benchmark data sets, the proposed FE method yields better classification rates than a few of the CR FE methods from the literature.

Highlights

  • Humans have a remarkable ability to recognize text in all sorts of forms: printed in different font styles, handwritten, sloppy, or inclined; camouflaged by the background; varying in illumination, brightness, and size; occluded; seen from various viewpoints; written upside down; with missing parts or unwanted decorations and marks; broken or even misspelled; rendered in artistic and figurative designs; and many more

  • It comes as no surprise that the computer vision (CV) community, despite more than six decades of intensive research, has achieved little in getting computers to represent images and to run well-defined, generic, low-level white-box processes on them (ANNs, by contrast, are black-box algorithms). Such processes would let computers detect and recognize text as robustly as humans do, alongside other tasks performed by the human visual cortex, such as classification

  • We considered 30,000 characters extracted from 280 aged multi-lingual Indian documents in English, Marathi, and Devanagari scripts to assess the performance of the proposed algorithm and achieved a success rate of 98.8%, which is almost equivalent to the success rate of 98.5% reported in [15]


Summary

INTRODUCTION

Humans have a remarkable ability to recognize text in all sorts of forms: printed in different font styles, handwritten, sloppy, or inclined; camouflaged by the background; varying in illumination, brightness, and size; occluded; seen from various viewpoints; written upside down; with missing parts or unwanted decorations and marks; broken or even misspelled; rendered in artistic and figurative designs; and many more. Along similar lines, part of the literature draws on physical and biological concepts to describe an image through its features. Different features provide robustness against different kinds of variation, so incorporating as many features as possible leaves the algorithm less room for error and makes the FV more generic. FE techniques reduce the dimensionality of the image to be recognized, making the recognition process computationally efficient and mathematically feasible. The extracted features are then checked for similarity against an abstract vector representing a character.
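
As a rough illustration of reducing a character image to a short FV and comparing it with stored character vectors, the sketch below computes crossing counts on tiny binary glyphs and picks the nearest template. It rests on several assumptions: crossing counts are taken here as the number of background-to-foreground transitions along each row and column, which is a common definition but may differ from the one used in the paper; the 5x5 glyphs, the names `crossing_counts`, `nearest_template`, and `TEMPLATES`, and the Euclidean distance are all hypothetical choices made for the example. It does not implement the proposed geometrical features and is neither scale- nor rotation-independent.

```python
from math import sqrt


def crossing_counts(image):
    """Count 0 -> 1 transitions along every row, then every column.

    `image` is a list of equal-length rows of 0/1 pixels (assumed format).
    """
    def transitions(line):
        count, previous = 0, 0  # pixels outside the image count as background
        for pixel in line:
            if pixel == 1 and previous == 0:
                count += 1
            previous = pixel
        return count

    rows = [transitions(row) for row in image]
    cols = [transitions(col) for col in zip(*image)]
    return rows + cols


def nearest_template(features, templates):
    """Return the label whose template vector is closest (Euclidean distance)."""
    def distance(a, b):
        return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    return min(templates, key=lambda label: distance(features, templates[label]))


# Toy glyphs, invented for this example (not from any data set in the paper).
GLYPH_I = [
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
]
GLYPH_H = [
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 1, 1, 1, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
]
# The same "H" with one missing pixel, standing in for a broken character.
BROKEN_H = [
    [1, 0, 0, 0, 1],
    [0, 0, 0, 0, 1],
    [1, 1, 1, 1, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
]

TEMPLATES = {
    "I": crossing_counts(GLYPH_I),
    "H": crossing_counts(GLYPH_H),
}

if __name__ == "__main__":
    fv = crossing_counts(BROKEN_H)
    print(fv, "->", nearest_template(fv, TEMPLATES))
```

Each 25-pixel glyph is reduced to a 10-element vector, and the broken "H" still lies closer to the "H" template than to "I". In the paper's setting, the CC features would be concatenated with the proposed geometrical features rather than used alone.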

RELATED WORK
PROPOSED WORK
Scale and Rotation Independent Feature Extraction Generation
Crossing Count Features Generation
EXPERIMENTAL RESULTS AND DISCUSSION
CONCLUSION
