Abstract

We propose a Multi-scale Multi-attention Network (MsMa-Net) to binarize document images contaminated by moiré patterns from camera-captured screens. Given a contaminated image, MsMa-Net first learns to distinguish clean features from contaminated ones at different spatial scales via a Multi-scale feature extraction submodule (Ms-sub). This preserves detailed text information as far as possible while providing a preliminary suppression of the moiré patterns. The resulting multi-scale features are then adaptively interwoven by a Multi-attention submodule (Ma-sub) at three levels: channel, spatial, and correlation. By modelling these relationships among the multi-scale features, Ma-sub further highlights text content and suppresses moiré patterns, yielding clean demoiréd document images. These images are passed to a Binarization submodule (Bi-sub), which produces the final binarized document images. Because data for the moiré document image binarization task is scarce, we also create a new Moiré Document Image (MoDI) dataset for training and evaluating the proposed model. Extensive experiments demonstrate that MsMa-Net achieves state-of-the-art performance on several available datasets and on the MoDI dataset.
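
To make the data flow described above concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes the multi-scale branches have already been resized to a common resolution, and it replaces the learned attention of Ma-sub with a simple stand-in: softmax weights computed from each branch's global average activation. The function names (`channel_attention_fuse`, `binarize`) and the fixed threshold are hypothetical and do not come from the paper.

```python
import math


def global_avg(feature_map):
    """Mean over all pixels of one 2-D feature map (a list of rows)."""
    total = sum(sum(row) for row in feature_map)
    count = sum(len(row) for row in feature_map)
    return total / count


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    norm = sum(exps)
    return [e / norm for e in exps]


def channel_attention_fuse(branches):
    """Fuse same-sized feature maps, one per scale, into a single map.

    Assumption: weights come from a softmax over each branch's global
    average. A real network would learn these weights instead.
    """
    weights = softmax([global_avg(b) for b in branches])
    height, width = len(branches[0]), len(branches[0][0])
    fused = [
        [sum(w * b[y][x] for w, b in zip(weights, branches)) for x in range(width)]
        for y in range(height)
    ]
    return fused, weights


def binarize(image, threshold=0.5):
    """Hypothetical stand-in for Bi-sub: fixed global threshold."""
    return [[1 if pixel >= threshold else 0 for pixel in row] for row in image]


# Two toy "scale" branches over a 2x2 image.
fine = [[0.9, 0.1], [0.2, 0.8]]
coarse = [[0.7, 0.3], [0.4, 0.6]]
fused_map, branch_weights = channel_attention_fuse([fine, coarse])
binary_map = binarize(fused_map)
```

The sketch covers only the channel-level weighting; the spatial and correlation levels named in the abstract would need per-pixel and pairwise weights respectively, which are omitted here.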
