Abstract

The authors describe a method for the recognition of cursively handwritten words using hidden Markov models (HMMs). The modelling methodology has previously been applied successfully to the recognition of both degraded machine-printed text and hand-printed numerals. A novel lexicon-driven level building (LDLB) algorithm is proposed, which incorporates a lexicon directly within the search procedure and maintains a list of plausible match sequences at each stage of the search, rather than decoding using only the most likely state sequence. A word recognition rate of 93.4% is achieved using a 713-word lexicon, compared to just 49.8% when the same lexicon is used to post-process the results produced by a standard level building algorithm. Various procedures are described for the normalisation of cursive script. Results are presented on a single-author database of scanned text. It is shown how very high reliability, approaching perfect recognition, can be achieved by using a threshold to reject those word hypotheses to which the system assigns a low confidence. At 19% rejection, 99.2% of accepted words appeared in the top two choices produced by the system, and 100% of the 1645 accepted words were correctly recognised within the top eight choices.
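The rejection scheme summarised above can be illustrated with a minimal sketch. The abstract does not say how the confidence value is computed, so the `confidence` field, the `WordResult` type, and the function name below are all hypothetical; the sketch only shows how a rejection rate and a top-n accuracy over accepted words could be measured once each word carries some confidence score.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class WordResult:
    truth: str             # ground-truth word (hypothetical test data)
    candidates: List[str]  # ranked candidate words, best first
    confidence: float      # assumed confidence score; higher means more certain


def evaluate_with_rejection(
    results: List[WordResult], threshold: float, top_n: int = 1
) -> Tuple[float, float]:
    """Return (rejection_rate, top-n accuracy measured on accepted words only)."""
    if not results:
        raise ValueError("results must not be empty")
    # Keep only the words whose confidence reaches the threshold.
    accepted = [r for r in results if r.confidence >= threshold]
    rejection_rate = 1.0 - len(accepted) / len(results)
    if not accepted:
        return rejection_rate, 0.0
    # A word counts as correct if its truth is among the first top_n candidates.
    hits = sum(r.truth in r.candidates[:top_n] for r in accepted)
    return rejection_rate, hits / len(accepted)
```

Raising `threshold` trades coverage for reliability: more words are rejected, and the accuracy is then computed over a smaller, more confident subset, which is the kind of trade-off the reported figures at 19% rejection describe.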
