Abstract

Binarization is an important processing step in document image analysis and recognition. The problem is challenging when image quality is degraded, for example by varying illumination, complicated backgrounds, or noise from ink spots, water stains, and document creases. In this paper, we propose a framework based on a deep convolutional neural network (DCNN) for adaptive binarization of degraded document images. The basic idea of our method is to use the DCNN to decompose a degraded document image into a spatial pyramid structure, with each layer at a different scale. The foreground image is then reconstructed sequentially from these layers in a coarse-to-fine manner by a deconvolutional network. This decomposition is beneficial because multi-resolution supervision can be introduced directly into network learning. We also define several loss functions for label consistency and foreground smoothing to further regularize the training of the network. Experimental results demonstrate the effectiveness of the proposed method.
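
To make the idea of multi-resolution supervision concrete, the following is a minimal sketch, not the implementation used in the paper. It assumes the ground-truth foreground mask is downsampled by 2x2 average pooling to form one target per pyramid level, and that a per-level binary cross-entropy is summed across levels. The function names (`downsample2x`, `build_target_pyramid`, `multiscale_loss`), the pooling scheme, and the choice of cross-entropy are all illustrative assumptions; the abstract does not specify them, and the label-consistency and foreground-smoothing losses are not modeled here.

```python
import numpy as np


def downsample2x(mask):
    """Halve resolution by 2x2 average pooling (assumes even height and width)."""
    h, w = mask.shape
    return mask.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))


def build_target_pyramid(mask, levels):
    """Return foreground targets ordered fine to coarse: full, 1/2, 1/4, ..."""
    pyramid = [mask.astype(np.float64)]
    for _ in range(levels - 1):
        pyramid.append(downsample2x(pyramid[-1]))
    return pyramid


def bce(pred, target, eps=1e-7):
    """Mean binary cross-entropy between predicted probabilities and targets."""
    pred = np.clip(pred, eps, 1.0 - eps)
    return float(-np.mean(target * np.log(pred)
                          + (1.0 - target) * np.log(1.0 - pred)))


def multiscale_loss(preds, targets, weights=None):
    """Weighted sum of per-level losses (hypothetical form of the supervision)."""
    if weights is None:
        weights = [1.0] * len(preds)
    return sum(w * bce(p, t) for w, p, t in zip(weights, preds, targets))


# Toy ground truth: an 8x8 page whose top-left quadrant is foreground (ink).
mask = np.zeros((8, 8))
mask[:4, :4] = 1.0
targets = build_target_pyramid(mask, levels=3)

# An uninformative network that outputs 0.5 everywhere at every scale.
uniform_preds = [np.full_like(t, 0.5) for t in targets]
loss_uniform = multiscale_loss(uniform_preds, targets)

# A perfect network that reproduces the target at every scale.
loss_perfect = multiscale_loss(targets, targets)
```

In this sketch every pyramid level contributes its own error signal, so a coarse layer is penalized directly for misplacing large foreground regions while the finest layer is penalized for stroke-level mistakes. The per-level weights are left uniform; how levels should actually be weighted is a design choice the abstract does not address.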
