Abstract
Character segmentation is a technique that separates individual characters from a character line image. It is one of the most important prerequisites for character recognition. In the past, segmenting individual characters from general material, such as mailing addresses and existing documents, was subject to strong constraints. Problems such as contact between adjacent characters and the splitting of a single character into separate parts prevented segmentation from becoming a systematic technique. This paper discusses character segmentation, an indispensable pre-processing step in character recognition that has so far been handled case by case. A character segmentation method is proposed that is based on two techniques: clustering of the character cluster interval histogram using a linear square-error function, and dynamic programming using a minimum variance criterion for the separation between candidate character sectioning positions in a line image. The first technique estimates the character pitch, i.e., the statistically best character pitch. It also extracts the parameters that represent the placement properties of a series of characters. The second technique determines a series of character sectioning positions in a stable way. In the experiment, the method is applied to English-language mailing addresses containing fixed and unspecified character pitches as well as contact between adjacent characters. A correct segmentation rate of 99.2% was obtained for characters and 98.0% for words, indicating the effectiveness of the method.
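The dynamic programming step can be illustrated with a minimal sketch. This is not the paper's algorithm, whose details the abstract does not give. It assumes that candidate sectioning positions and an estimated pitch are already available, and it replaces the minimum variance criterion with a simpler stand-in: the sum of squared deviations of each segment width from the pitch. The function name `choose_cuts` and the sample values are hypothetical.

```python
def choose_cuts(candidates, pitch):
    """Select sectioning positions from sorted candidate x-coordinates.

    Assumption (not from the paper): the first and last candidates are
    always kept, and the cost of a segment is the squared deviation of
    its width from the estimated pitch.
    """
    n = len(candidates)
    if n < 2:
        raise ValueError("need at least two candidate positions")

    # cost[j]: best total cost of a cut sequence that ends at candidate j
    # prev[j]: index of the candidate chosen just before j on that sequence
    cost = [float("inf")] * n
    prev = [-1] * n
    cost[0] = 0.0

    for j in range(1, n):
        for i in range(j):
            width = candidates[j] - candidates[i]
            total = cost[i] + (width - pitch) ** 2
            if total < cost[j]:
                cost[j] = total
                prev[j] = i

    # Walk back from the last candidate to recover the chosen positions.
    path = []
    k = n - 1
    while k != -1:
        path.append(candidates[k])
        k = prev[k]
    return path[::-1]


if __name__ == "__main__":
    # Hypothetical candidates; position 14 is a spurious cut inside a character.
    print(choose_cuts([0, 9, 14, 20, 31, 40], pitch=10))
```

With these sample values the spurious candidate at 14 is skipped, because every sequence that passes through it produces segment widths further from the pitch than the sequence that avoids it.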