Abstract

Key-Word Spotting (KWS) in handwritten documents is approached here by means of Word Graphs (WGs) obtained using segmentation-free handwritten text recognition technology based on N-gram Language Models and Hidden Markov Models. Linguistic context significantly boosts KWS performance with respect to methods that ignore word context and/or rely on image matching with pre-segmented isolated words. In addition, WG-based KWS can be significantly faster than other KWS approaches that work directly on the original images, where computational demands are, in general, exceedingly high. A large WG contains most of the information in the original text (line) image that is needed for KWS but, if it is too large, the computational advantages over traditional KWS based on image matching diminish. Conversely, if it is too small, relevant information may be lost, leading to degraded KWS precision/recall performance. We study the trade-off between WG size and KWS information retrieval performance. Results show that small, computationally cheap WGs can be used without losing the excellent KWS performance achieved with huge WGs.
