Abstract

We propose a method of binary object classification for lossy compression of document images. The dictionary is formed in two passes by a Pattern Matching & Substitution (PM&S) algorithm: the first pass classifies all objects using the Tanimoto distance, and the second pass classifies only the patterns chosen in the first pass. In each pass, the generated classes are refined by choosing the best representative member of each class as its pattern. We tested our dictionary formation method in a compression system that uses the 3OT chain code to encode the chosen patterns and the Paq8l archiver to compress the resulting string. On the eight CCITT binary test images, at 200 and 600 dpi, the resulting compression ratios are better than those of state-of-the-art and standard methods. Compared with JBIG2, we obtained 27% better compression at 200 dpi and 65% at 600 dpi; compared with DjVu's JB2, 6% better at 200 dpi and 35% at 600 dpi; and compared with 3OT-Paq8l, 38% better at 200 dpi and 30% at 600 dpi.
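The two-pass dictionary formation described above can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: each object is represented as a set of foreground pixel coordinates already aligned to a common origin, classification is a greedy first-fit against each class's first member, the "best representative" is taken to be the member with the smallest total Tanimoto distance to the rest of its class, and both passes share one hypothetical `threshold` parameter. All function names are hypothetical.

```python
from typing import FrozenSet, List, Tuple

# Assumption: one object = the set of its foreground pixel coordinates (row, col).
Pixels = FrozenSet[Tuple[int, int]]


def tanimoto_distance(a: Pixels, b: Pixels) -> float:
    """Return 1 - |A & B| / |A | B|: 0.0 for identical shapes, 1.0 for disjoint ones."""
    union = len(a | b)
    if union == 0:
        return 0.0  # two empty objects are treated as identical
    return 1.0 - len(a & b) / union


def classify(objects: List[Pixels], threshold: float) -> List[List[Pixels]]:
    """Greedy pass: put each object in the first class whose first member is close enough."""
    classes: List[List[Pixels]] = []
    for obj in objects:
        for members in classes:
            if tanimoto_distance(obj, members[0]) <= threshold:
                members.append(obj)
                break
        else:
            classes.append([obj])  # no class matched, so start a new one
    return classes


def best_representative(members: List[Pixels]) -> Pixels:
    """Refinement (assumed criterion): member with the smallest total distance to its class."""
    return min(members, key=lambda m: sum(tanimoto_distance(m, o) for o in members))


def build_dictionary(objects: List[Pixels], threshold: float) -> List[Pixels]:
    """Pass 1 classifies all objects; pass 2 classifies only the pass-1 patterns."""
    first_pass = [best_representative(c) for c in classify(objects, threshold)]
    second_pass = [best_representative(c) for c in classify(first_pass, threshold)]
    return second_pass


if __name__ == "__main__":
    square = frozenset({(0, 0), (0, 1), (1, 0), (1, 1)})
    chipped = frozenset({(0, 0), (0, 1), (1, 0)})  # same square with one pixel missing
    dash = frozenset({(5, 5), (5, 6)})
    patterns = build_dictionary([square, chipped, dash], threshold=0.3)
    print(len(patterns), "patterns kept out of 3 objects")
```

In this toy input the chipped square falls into the same class as the full square, so only one of the two is stored as a pattern. A real system would also need to align objects of different sizes before comparing them, which this sketch omits.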
