Abstract

Optical character recognition (OCR) lies at the intersection of natural language processing and image processing. An OCR system converts text that is present in image form into machine-readable text. Identifying text in printed, handwritten, and degraded document images is challenging because of the high inter- and intra-variation between the background and the foreground of a document image. Degraded documents may be historical records, secret messages, or any other material of value, so recovering their textual content is critical. Degradation can result from age, deliberate information hiding, or various types of image noise. The task becomes harder still when the text in a document image is degraded, or when characters or text lines overlap. Segmenting word-level text into characters is one of the major challenges in OCR, because touching or broken characters cannot easily be separated from each other. This paper focuses on finding and applying an efficient method for segmenting touching characters in Indian languages, and discusses some existing solution techniques. It also proposes a framework for using these solutions to obtain the maximum benefit. The proposed work on recognizing overlapping characters in document images is aimed primarily at Indian languages.

