Dependence Models for Searching Text in Document Images.

Ismet Zeki Yalniz, R Manmatha

doi:10.1109/tpami.2017.2780108

Open Access

Abstract

Existing word spotting approaches for searching document images have mainly aimed to identify visually similar word images when high-quality text recognition output is unavailable. With these approaches, searching for an arbitrary piece of text is not possible unless the user identifies a sample word image in the document collection or generates the query word image synthetically. To address this problem, a Markov Random Field (MRF) framework is proposed for searching document images. It is shown to be effective for searching arbitrary text in real time in books printed in English (Latin script), Telugu, and Ottoman scripts. The English experiments demonstrate that the dependencies between visual terms and letter bigrams can be learned automatically from noisy OCR output. It is also shown that OCR text search accuracy improves significantly when it is combined with the proposed approach. No commercial OCR engine is available for the Telugu or Ottoman scripts, so in these cases the dependencies are trained using manually annotated document images. The trained model can then be used directly to resolve arbitrary text queries across books despite differences in font type and size. The proposed approach outperforms a state-of-the-art BLSTM baseline in these contexts.

Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence
Publication Date: Dec 6, 2017
Citations: 48
License type: publisher-specific, author manuscript

