Abstract

Because most historical documents are in poor condition, binarization struggles to separate the foreground pixels of a document image from its background pixels. This paper proposes Convolutional Neural Networks (CNNs) that take predicted two-channel images as input and are trained to classify the foreground pixels. The promising results obtained with multispectral images in semantic segmentation inspired our effort to create a novel prediction-based two-channel image. In our method, the original image is first binarized by the structural symmetric pixels (SSPs) method, and the two-channel image is then constructed from the original image and its binarized version. To explore the impact of the proposed two-channel images as network inputs, we use two popular CNN architectures, namely SegNet and U-net. The results presented in this work show that our approach outperforms SegNet and U-net trained on the original images alone, and that it is competitive and robust compared with state-of-the-art results on the DIBCO database.
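As a rough illustration of the input construction described above, the following minimal sketch stacks a grayscale image with a binarized prediction of it to form a two-channel array. It is not the authors' implementation: the function names `placeholder_binarize` and `make_two_channel` are hypothetical, the global mean threshold is only a stand-in for the SSPs method (which is not reproduced here), and the scaling to [0, 1] and the channels-last layout are assumptions.

```python
import numpy as np


def placeholder_binarize(gray: np.ndarray) -> np.ndarray:
    """Stand-in binarizer (assumption): global mean threshold, NOT the SSPs method.

    Pixels darker than the image mean are treated as foreground (1.0),
    all others as background (0.0).
    """
    threshold = gray.mean()
    return (gray < threshold).astype(np.float32)


def make_two_channel(gray: np.ndarray, binarize=placeholder_binarize) -> np.ndarray:
    """Stack the original image and its binarized prediction as two channels.

    gray: 2-D array of 8-bit intensities with shape (height, width).
    Returns a float32 array with shape (height, width, 2).
    """
    if gray.ndim != 2:
        raise ValueError("expected a 2-D grayscale image")
    original = gray.astype(np.float32) / 255.0  # channel 0: scaled original image
    predicted = binarize(gray)                  # channel 1: predicted binary map
    return np.stack([original, predicted], axis=-1)


# Tiny synthetic example: dark "ink" pixels on a bright background.
page = np.array([[0, 255], [255, 0]], dtype=np.uint8)
two_channel = make_two_channel(page)
```

An array built this way could be fed to a segmentation network whose first convolution accepts two input channels. Any other binarizer with the same signature can be passed through the `binarize` argument.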

