Abstract
Off-line recognition of handwritten Chinese characters is of considerable practical importance and is also a very hard pattern recognition problem. A popular approach is to decompose characters into their component or ‘primitive’ parts, most usually strokes. Here, however, we take the less usual approach of decomposing characters into radicals. Active shape modelling is applied and developed into active radical modelling. In training, 60 examples of each radical are represented by ‘landmark’ points, labelled semi-automatically, with a radical that occurs in different characteristic positions treated as a distinct radical for each position. Principal component analysis then captures the main modes of variation around the mean radical. In recognition, the dynamic tunnelling algorithm is combined with gradient descent to search for the shape parameters that minimise chamfer distance. Although prior landmark labelling is time-consuming and gradient descent search during recognition is computationally expensive, the method is theoretically well motivated, incorporates prior knowledge about the structure of Chinese characters in an appropriate way, and avoids the problems implicit in stroke extraction. Experiments are conducted on 280,000 loosely constrained characters from 200 writers. The 1400 character categories contain 98 different categories of radical, and approximately 590,000 radicals in total. On this large test set, 94.2% of radicals are matched correctly (writer-independent), greatly superior to existing radical-based approaches. Assuming character composition to be a Markov process in which up to four radicals are combined in some assumed sequential order, we can recognise complete, hierarchically composed characters using the Viterbi algorithm. This results in a character recognition rate of 92.6%.
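The training step summarised in the abstract, principal component analysis of landmark points around a mean shape, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`fit_shape_model`, `synthesise`, `project`), the choice of 8 landmarks and 2 modes, and the synthetic data are all assumptions made for illustration, and the shapes are assumed to be already aligned. Only the figure of 60 examples per radical comes from the abstract.

```python
import numpy as np

def fit_shape_model(shapes, n_modes):
    """Fit a linear shape model to aligned landmark sets.

    shapes: array of shape (n_examples, n_landmarks, 2).
    Returns the mean shape vector, the leading modes, and their variances.
    """
    n_examples = shapes.shape[0]
    X = shapes.reshape(n_examples, -1)            # one flat vector per example
    mean = X.mean(axis=0)
    # SVD of the centred data yields the principal axes directly.
    _, s, vt = np.linalg.svd(X - mean, full_matrices=False)
    modes = vt[:n_modes].T                        # (2 * n_landmarks, n_modes)
    variances = s[:n_modes] ** 2 / (n_examples - 1)
    return mean, modes, variances

def synthesise(mean, modes, b):
    """Generate a shape from parameters b: x = mean + modes @ b."""
    return (mean + modes @ b).reshape(-1, 2)

def project(mean, modes, shape):
    """Shape parameters that best describe a given landmark set."""
    return modes.T @ (shape.reshape(-1) - mean)

# Synthetic stand-in for 60 labelled examples of one radical (assumed data).
rng = np.random.default_rng(0)
base = rng.normal(size=(8, 2))                    # hypothetical mean radical
direction = rng.normal(size=(8, 2))               # one dominant deformation
coeffs = rng.normal(size=(60, 1, 1))
shapes = base + coeffs * direction + 0.01 * rng.normal(size=(60, 8, 2))

mean, modes, variances = fit_shape_model(shapes, n_modes=2)
b = project(mean, modes, shapes[0])               # parameters of one example
approx = synthesise(mean, modes, b)               # its low-dimensional reconstruction
```

In a recognition setting of the kind the abstract describes, the parameter vector `b` would be the quantity adjusted by the search, with each candidate shape scored against the image; that search and the chamfer distance are not shown here.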