Abstract

This paper describes how a robust and compact on-line handwritten Japanese text recognizer was developed for deployment on hand-held devices by compressing each component of an integrated text recognition system: an SVM classifier that evaluates segmentation points, a combined on-line and off-line character recognizer, a linguistic context processor, and a geometric context evaluation module. Selecting an elastic-matching-based on-line recognizer and compressing MQDF2 through a combination of LDA, vector quantization, and data type transformation contributed to a remarkably small yet robust recognizer. The compact text recognizer covers 7,097 character classes and requires only about 15MB of memory while maintaining 93.11% accuracy on horizontal text lines extracted from the TUAT Kondate database. Compared with the original full-scale Japanese text recognizer, memory size is reduced from 64.1MB to 14.9MB, while accuracy drops by only about 0.5 percentage points, from 93.6% to 93.11%. The method is scalable: systems of less than 11MB or less than 6MB still retain 92.80% or 90.02% accuracy, respectively.
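As a rough illustration of what "data type transformation" can mean for classifier parameters, the sketch below stores 32-bit floating-point values as 8-bit codes plus an offset and a scale. The abstract does not state which data types the authors used, so the 32-bit to 8-bit mapping, the function names, and the sample values are all assumptions made for illustration only.

```python
from array import array


def quantize_uint8(values):
    """Map floats linearly onto the codes 0..255 (assumed 8-bit target type)."""
    lo, hi = min(values), max(values)
    scale = (hi - lo) / 255 or 1.0  # fall back to 1.0 if all values are equal
    codes = array("B", (min(255, round((v - lo) / scale)) for v in values))
    return codes, lo, scale


def dequantize(codes, lo, scale):
    """Recover approximate float values from the 8-bit codes."""
    return [lo + c * scale for c in codes]


# Hypothetical parameter vector stored as 32-bit floats.
params = array("f", [0.0, 0.25, -1.5, 3.0, 2.2, -0.7])

codes, lo, scale = quantize_uint8(params)
restored = dequantize(codes, lo, scale)

# Storage ratio between the float array and the code array.
ratio = (params.itemsize * len(params)) / (codes.itemsize * len(codes))
# Worst-case reconstruction error; bounded by half a quantization step.
max_err = max(abs(a - b) for a, b in zip(params, restored))

print(f"compression ratio: {ratio:.1f}x, max error: {max_err:.4f}")
```

In this toy setting the codes take a quarter of the storage, and the reconstruction error stays within half a quantization step. How such a step combines with LDA and vector quantization in the actual recognizer is described in the full paper, not here.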
