Abstract

Abstract Word searching or keyword spotting is an important research problem in the domain of document image processing. The solution to the said problem for handwritten documents is more challenging than for printed ones. In this work, a two-stage word searching schema is introduced. In the first stage, all the irrelevant words with respect to a search word are filtered out from the document page image. This is carried out using a zonal feature vector, called pre-selection feature vector, along with a rule-based binary classification method. In the next step, a holistic word recognition paradigm is used to confirm a pre-selected word as search word. To accomplish this, a modified histogram of oriented gradients-based feature descriptor is combined with a topological feature vector. This method is experimented on a QUWI English database, which is freely available through the International Conference on Document Analysis and Recognition 2015 competition entitled “Writer Identification and Gender Classification.” This technique not only provides good retrieval performance in terms of recall, precision, and F-measure scores, but it also outperforms some state-of-the-art methods.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.