Abstract
This paper presents a recognition-based character segmentation method for handwritten Chinese characters. Possible non-linear segmentation paths are first located using a probabilistic Viterbi algorithm. Candidate segmentation paths are then selected by checking overlapping paths, between-character gaps, and adjacent-path distances. A segmentation graph is constructed in which each candidate path is a node, and two nodes separated by an appropriate distance are connected by an arc. The cost of each arc is a function of character recognition distances, the squareness of characters, and internal gaps in characters. The shortest path through the segmentation graph is then found, and the nodes on that path represent the optimal segmentation paths. For the experiments, 125 text-line images were collected from seven form documents; together, these text-lines contain 1132 handwritten Chinese characters. The average segmentation rate in our experiments is 95.58%. Moreover, the probabilistic Viterbi algorithm is slightly modified to extract text-lines from document pages by finding non-linear paths when the gaps between text-lines are not obvious. The segmentation method can also be adapted to printed text-line images by adjusting the parameters that define the arc costs in the segmentation graph.
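The shortest-path step described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: candidate segmentation paths are simplified to vertical cuts at integer x-positions, the function names `best_segmentation` and `toy_arc_cost` are hypothetical, and the toy cost uses only a squareness term, whereas the paper's arc cost also includes recognition distances and internal gaps.

```python
from typing import Callable, List, Tuple


def best_segmentation(
    cuts: List[int],
    arc_cost: Callable[[int, int], float],
    max_width: int,
) -> Tuple[List[int], float]:
    """Shortest path over a segmentation graph (illustrative sketch).

    cuts      : sorted x-positions of candidate cuts (graph nodes);
                the first and last entries are the text-line borders.
    arc_cost  : assumed cost of treating the span [left, right] as one character.
    max_width : nodes farther apart than this are not connected by an arc.
    """
    n = len(cuts)
    inf = float("inf")
    dist = [inf] * n   # cheapest cost found so far to reach each node
    prev = [-1] * n    # predecessor of each node on its cheapest path
    dist[0] = 0.0

    # Nodes are ordered left to right, so the graph is a DAG and a single
    # dynamic-programming pass yields the shortest path.
    for j in range(1, n):
        for i in range(j):
            if dist[i] == inf or cuts[j] - cuts[i] > max_width:
                continue
            candidate = dist[i] + arc_cost(cuts[i], cuts[j])
            if candidate < dist[j]:
                dist[j] = candidate
                prev[j] = i

    if dist[-1] == inf:
        return [], inf  # the right border cannot be reached

    # Walk the predecessor links back from the right border.
    path = []
    k = n - 1
    while k != -1:
        path.append(cuts[k])
        k = prev[k]
    path.reverse()
    return path, dist[-1]


LINE_HEIGHT = 20  # assumed text-line height in pixels


def toy_arc_cost(left: int, right: int) -> float:
    """Toy cost: penalise spans whose width differs from the line height."""
    return abs((right - left) - LINE_HEIGHT) / LINE_HEIGHT


path, cost = best_segmentation([0, 10, 20, 28, 40], toy_arc_cost, max_width=30)
print(path, cost)  # -> [0, 20, 40] 0.0
```

In this toy input the cuts at 10 and 28 are rejected because they would produce non-square spans, so the selected cuts split the line into two 20-pixel-wide characters.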