An approach for detecting and cleaning of struck-out handwritten text

Bidyut B Chaudhuri, Chandranath Adak

doi:10.1016/j.patcog.2016.07.032

Abstract

This paper deals with the identification and processing of struck-out text in unconstrained offline handwritten document images. If passed to an OCR engine, such text produces nonsensical character strings. We present a method for identifying such text that combines (a) pattern classification and (b) a graph-based approach. For (a), a feature-based two-class (normal vs. struck-out text) SVM classifier detects moderate-sized struck-out components. For (b), the skeleton of the text component is treated as a graph, and the strike-out stroke is identified using a constrained shortest-path algorithm. To identify zigzag or wavy strike-outs, all paths are found and some properties of zigzag and wavy lines are used. Some other types of strike-out stroke are detected by modifying this method. Large multi-word and multi-line struck-outs are segmented into smaller components and treated as above. The detected struck-out text can then be blocked from entering the OCR engine. In another kind of application, involving historical documents, page images must be generated along with their annotated ground truth. In this case, the strike-out strokes can be deleted from the words, which are then fed to the OCR engine; an inpainting-based cleaning approach is employed for this purpose. We worked on 500 pages of documents and obtained an overall F-measure of 91.56% (91.06%) in English (Bengali) script for struck-out text detection. For strike-out stroke identification and deletion, the F-measures were 89.65% (89.31%) and 91.16% (89.29%), respectively.
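
The abstract does not state which constraints the shortest-path step uses, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes the skeleton is given as a binary grid, treats skeleton pixels as nodes of an 8-connected graph, and searches for a path from the leftmost to the rightmost skeleton column. The straightness constraint, the `turn_penalty` parameter, and the function names are hypothetical choices made for illustration.

```python
import heapq
from math import hypot

# All 8-connected moves on the pixel grid.
MOVES = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)]
NO_DIRECTION = (0, 0)  # marker for "no incoming direction yet" (start states)


def skeleton_pixels(grid):
    """Return the set of (row, col) positions marked 1 in a binary skeleton."""
    return {(r, c) for r, row in enumerate(grid) for c, v in enumerate(row) if v}


def straightest_path(grid, turn_penalty=2.0):
    """Dijkstra search over (pixel, incoming direction) states.

    Finds a cheapest path from the leftmost to the rightmost skeleton
    column. Each change of direction costs an extra `turn_penalty`
    (an assumed constraint), so near-straight strokes are preferred.
    """
    pixels = skeleton_pixels(grid)
    if not pixels:
        return []
    min_col = min(c for _, c in pixels)
    max_col = max(c for _, c in pixels)

    sources = [p for p in pixels if p[1] == min_col]
    heap = [(0.0, p, NO_DIRECTION) for p in sources]
    heapq.heapify(heap)
    best = {(p, NO_DIRECTION): 0.0 for p in sources}
    parent = {}

    while heap:
        cost, pos, direction = heapq.heappop(heap)
        if cost > best.get((pos, direction), float("inf")):
            continue  # stale heap entry
        if pos[1] == max_col:
            # Walk the parent links back to a source pixel.
            path, state = [pos], (pos, direction)
            while state in parent:
                state = parent[state]
                path.append(state[0])
            return path[::-1]
        for move in MOVES:
            nxt = (pos[0] + move[0], pos[1] + move[1])
            if nxt not in pixels:
                continue
            step = hypot(*move)
            if direction != NO_DIRECTION and move != direction:
                step += turn_penalty
            new_cost = cost + step
            if new_cost < best.get((nxt, move), float("inf")):
                best[(nxt, move)] = new_cost
                parent[(nxt, move)] = (pos, direction)
                heapq.heappush(heap, (new_cost, nxt, move))
    return []


# Toy skeleton: a diamond-shaped "letter" crossed by a horizontal stroke on row 2.
demo = [
    [0, 0, 1, 0, 0, 0, 0],
    [0, 1, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1],
    [0, 1, 0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 0],
]
print(straightest_path(demo))
```

On this toy grid the returned path runs along row 2 from column 0 to column 6, which corresponds to the horizontal stroke. A real skeleton would first have to be extracted from the document image, and the zigzag and wavy cases mentioned in the abstract would need a different criterion from this straightness penalty.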

Journal: Pattern Recognition
Publication Date: Jul 26, 2016
Citations: 31
