Abstract

In integrated segmentation and recognition of character strings, the underlying classifier is trained to be resistant to noncharacters. We evaluate how well state-of-the-art pattern classifiers perform in this role. First, we build a baseline numeral string recognition system with simple but effective presegmentation. The classification scores of the candidate patterns generated by presegmentation are combined to evaluate the segmentation paths, and the optimal path is found using beam search. Three neural classifiers, two discriminative density models, and two support vector classifiers are evaluated. Each classifier has several variations depending on the training strategy: maximum likelihood, or discriminative learning with or without noncharacter samples. String recognition performance is evaluated on the numeral string images of the NIST Special Database 19 and the zipcode images of the CEDAR CDROM-1. The results show that noncharacter training is crucial for neural classifiers and support vector classifiers, whereas for the discriminative density models the regularization of parameters is important. The string recognition results compare favorably to the best ones reported in the literature, even though we ignored geometric context entirely. The best results were obtained using a support vector classifier, but the neural classifiers and discriminative density models offer a better trade-off between accuracy and computational overhead.
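
The path evaluation described above (scores of candidate patterns combined along a segmentation path, with the best path found by beam search) can be illustrated with a minimal Python sketch. It is not the system evaluated in the paper; it rests on several assumptions made only for illustration: candidate patterns are spans of consecutive primitive segments, each span receives per-class log-scores from a hypothetical scorer, an empty score dictionary stands for rejection as a noncharacter, only the top class of each span is kept, and a path score is the plain sum of its pattern scores. The names beam_search_segmentation, toy_scores, and toy_scorer are invented for this example.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical scorer type (an assumption, not from the paper): maps a span
# of primitive segments [start, end) to {class label: log-score}.
# An empty dict means the classifier rejected the span as a noncharacter.
Scorer = Callable[[int, int], Dict[str, float]]


def beam_search_segmentation(
    n_primitives: int,
    score_span: Scorer,
    max_span: int = 3,
    beam_width: int = 5,
) -> Tuple[str, float]:
    """Return the best (string, path score) over all segmentation paths kept by the beam."""
    # beams[k] holds the best partial paths ending at cut point k,
    # each stored as (accumulated score, string recognized so far).
    beams: List[List[Tuple[float, str]]] = [[] for _ in range(n_primitives + 1)]
    beams[0] = [(0.0, "")]

    for end in range(1, n_primitives + 1):
        candidates: List[Tuple[float, str]] = []
        for start in range(max(0, end - max_span), end):
            if not beams[start]:
                continue  # no surviving path reaches this cut point
            scores = score_span(start, end)
            if not scores:
                continue  # candidate pattern rejected as a noncharacter
            # Simplification: keep only the top-scoring class for the span.
            label, best = max(scores.items(), key=lambda kv: kv[1])
            for acc, text in beams[start]:
                candidates.append((acc + best, text + label))
        # Prune: keep only the beam_width best partial paths at this cut point.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams[end] = candidates[:beam_width]

    if not beams[n_primitives]:
        raise ValueError("no complete segmentation path survived")
    score, text = beams[n_primitives][0]
    return text, score


# Toy lattice over 4 primitive segments; all scores are made up.
toy_scores: Dict[Tuple[int, int], Dict[str, float]] = {
    (0, 1): {"1": -0.1},
    (0, 2): {"4": -3.0},
    (1, 2): {"1": -2.0},
    (1, 3): {"0": -0.2},
    (2, 3): {"1": -2.5},
    (3, 4): {"7": -0.3},
}


def toy_scorer(start: int, end: int) -> Dict[str, float]:
    # Spans missing from the table are treated as rejected noncharacters.
    return toy_scores.get((start, end), {})


if __name__ == "__main__":
    print(beam_search_segmentation(4, toy_scorer))
```

On the toy lattice the search selects the path that merges primitives 1 and 2 into a single "0" and returns the string "107". Because a plain sum of log-scores favors paths with fewer patterns, a practical system would likely normalize the path score in some way; that choice is left out of this sketch.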
