Abstract

Modern scan routines require a predefined scan resolution, whether it is customer-selected or a default value in the scanner’s settings. Once the scanning process begins, the resolution cannot be changed. As a result, all scanned pages produce output images of the same size, no matter how much their contents vary. If we can determine an optimal resolution for the raster content of each scanned document, we can save considerable storage. In this paper, the resolutions in question are 300 dpi, 150 dpi, and 75 dpi. We define the criteria for optimal scan resolution and propose new features to help determine it for scanned document raster content. The proposed features are sample power spectrum mean squared error (MSE), edge density, and edge contrast. These features reflect both the fidelity of the low-resolution (150 dpi and 75 dpi) images to their high-resolution 300 dpi references and the intrinsic changes among them. Combining them with spatial activity, the mean of the tile standard deviation (STDDEV) structural similarity index measure (tile-STDDEV SSIM), and the STDDEV of the same measure (tile-STDDEV SSIM STDDEV), we form a feature vector, which is then fed into an SVM classifier. Test results show that we can achieve a prediction accuracy of 93.4%.
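As a rough illustration of two of the features named above, the following Python sketch computes a sample power spectrum MSE and an edge density between a reference image and a simulated low-resolution counterpart. The abstract does not give the paper's exact definitions, so everything here is an assumption: the function names, the gradient threshold of 32, the normalization of the power spectrum, and the block-average downsampling used to stand in for a lower-dpi scan are all hypothetical choices made for illustration.

```python
import numpy as np


def simulate_low_res(img, factor):
    """Assumed stand-in for a lower-dpi scan: block-average by `factor`,
    then replicate pixels back to the original grid."""
    h, w = img.shape
    h2, w2 = h - h % factor, w - w % factor  # crop to a multiple of factor
    cropped = img[:h2, :w2].astype(np.float64)
    small = cropped.reshape(h2 // factor, factor, w2 // factor, factor).mean(axis=(1, 3))
    return np.kron(small, np.ones((factor, factor)))


def power_spectrum(img):
    """Sample power spectrum: squared DFT magnitude divided by pixel count
    (the normalization is an assumption)."""
    spectrum = np.fft.fft2(img.astype(np.float64))
    return np.abs(spectrum) ** 2 / img.size


def power_spectrum_mse(ref, test):
    """Mean squared error between two sample power spectra of equal shape."""
    return float(np.mean((power_spectrum(ref) - power_spectrum(test)) ** 2))


def edge_density(img, threshold=32.0):
    """Fraction of pixels whose gradient magnitude exceeds a hypothetical
    threshold; the paper's edge detector may differ."""
    gy, gx = np.gradient(img.astype(np.float64))
    return float(np.mean(np.hypot(gx, gy) > threshold))


def feature_vector(ref, factor):
    """Two-element illustrative feature vector for one resolution reduction."""
    h, w = ref.shape
    ref = ref[: h - h % factor, : w - w % factor]
    low = simulate_low_res(ref, factor)
    return np.array([power_spectrum_mse(ref, low), edge_density(ref) - edge_density(low)])
```

Under these assumptions, a page whose detail survives the reduction yields a spectrum MSE near zero, while fine detail that averages away yields a large one. A vector like this, extended with the remaining features, is the kind of input an SVM classifier could be trained on.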
