Abstract
The on-line sequence modelling algorithm ‘Prediction by Partial Matching’ (PPM) has set the performance standard in lossless data compression research since Moffat's 1990 implementation, PPMC. Despite intense research activity, only Howard's 1993 escape-count update mechanism ‘D’ has provided any consistent, order-independent performance improvement to PPMC (about 1%). Most notably, the recently introduced PPM variant PPM*, which eliminates PPM's order bound, fails to offer compression superior to that of PPMC when PPMC's Markov order exceeds four. This paper explains how to significantly improve the compression performance of any PPM variant (by 5–12%) by combining PPM's probability estimator, ‘blending’, with information-theoretic state selection. The hazards inherent in this combination are overcome by identifying the distinct semantics of the two approaches and resolving their differences with a dual-frequency update mechanism. We present and apply our percolating state selector, plus an enhancement to blending, both of which we have recently shown to independently outperform all competing techniques from the literature. We also give a minimal linear-space suffix-tree implementation of PPM and PPM*. Performance is measured in experiments run on the Calgary Corpus using our reimplementation of the original algorithms as an executable cross-product of independent model components, which permits precise control of all modelling algorithm features.
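To make the abstract's references to PPMC and Howard's method ‘D’ concrete, the sketch below shows how a single PPM context might estimate symbol and escape probabilities under the two escape methods. It is an illustrative assumption, not the paper's implementation: the Context structure, the observe/prob_c/prob_d helpers, and the toy training string are all invented for this example.

```c
/*
 * Illustrative sketch (not the paper's code): escape-probability
 * estimation in one PPM context under method C (Moffat, PPMC) and
 * method D (Howard, PPMD).
 */
#include <stdio.h>

#define ALPHABET 256

typedef struct {
    unsigned count[ALPHABET];  /* per-symbol frequency in this context */
    unsigned total;            /* sum of all symbol counts             */
    unsigned distinct;         /* number of symbols seen at least once */
} Context;

/* Method C: the escape count equals the number of distinct symbols,
 * so P(s) = count(s) / (total + distinct) and P(esc) = distinct / (total + distinct). */
static double prob_c(const Context *c, int sym)
{
    double denom = (double)(c->total + c->distinct);
    if (sym < 0)                      /* sym < 0 requests P(escape) */
        return c->distinct / denom;
    return c->count[sym] / denom;
}

/* Method D: each first occurrence donates half a count to escape,
 * so P(s) = (2*count(s) - 1) / (2*total) and P(esc) = distinct / (2*total). */
static double prob_d(const Context *c, int sym)
{
    double denom = 2.0 * c->total;
    if (sym < 0)
        return c->distinct / denom;
    return (2.0 * c->count[sym] - 1.0) / denom;
}

/* Record one symbol occurrence in this context. */
static void observe(Context *c, int sym)
{
    if (c->count[sym]++ == 0)
        c->distinct++;
    c->total++;
}

int main(void)
{
    Context ctx = {0};
    const char *history = "abracadabra";   /* toy training string */
    for (const char *p = history; *p; p++)
        observe(&ctx, (unsigned char)*p);

    printf("P_C('a') = %.4f   P_D('a') = %.4f\n",
           prob_c(&ctx, 'a'), prob_d(&ctx, 'a'));
    printf("P_C(esc) = %.4f   P_D(esc) = %.4f\n",
           prob_c(&ctx, -1), prob_d(&ctx, -1));
    return 0;
}
```

On the toy string "abracadabra" this prints P_C(esc) = 5/16 versus P_D(esc) = 5/22, illustrating how method D's half-count escape estimate is typically smaller, which is the source of its roughly 1% improvement over method C noted in the abstract.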