Abstract

This paper introduces a layout- and perspective-distortion-independent recognition framework for captured Chinese document images. The framework has three components: 1) A conditional random field (CRF) is employed to extract text lines from a global point of view. Because the text line extraction is layout independent, it can be applied to many different types of document images. 2) A perspective distortion correction method based on text line images is detailed and used in three different ways. 3) Text line extraction and perspective distortion correction are combined with character recognition to construct a recognition system. On three captured document image datasets with different degrees of distortion, the proposed framework improves accuracy from 94.03% to 95.20%, from 13.01% to 93.71%, and from 10.63% to 92.68%, respectively. The experimental results demonstrate that the framework is promising for solving layout and perspective distortion problems in captured document image recognition.
