Abstract

We consider the problem of compressing discrete memoryless data sequences for the purpose of similarity identification, first studied by Ahlswede et al. (1997). In this setting, a source sequence is compressed so that, given only the compressed version, one can identify whether the original source sequence is similar to another given sequence (called the query sequence). There is no requirement that the source be reproducible from the compressed version. In the case where no false negatives are allowed, a compression scheme is said to be reliable if the probability of error (false positive) vanishes as the sequence length grows. The minimal compression rate in this sense, which parallels the classical rate-distortion function, is called the identification rate. The rate at which the error probability vanishes is measured by its exponent, called the identification exponent (the analog of the classical excess-distortion exponent). While an information-theoretic expression for the identification exponent was found in past work, it is uncomputable because it depends on an auxiliary random variable with unbounded cardinality. The main result of this paper is a cardinality bound on the auxiliary random variable in the identification exponent, which makes the quantity computable and solves the problem left open by Ahlswede et al. The new proof technique relies on the fact that the Lagrangian in the optimization problem (in the expression for the exponent) can be decomposed by coordinate (of the auxiliary random variable). A standard Carathéodory-style argument then completes the proof.
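To make the setting concrete, the following is a minimal sketch of a toy identification scheme with no false negatives. It is not the scheme analyzed in the paper. It assumes binary sequences, Hamming distance as the similarity measure, and a small fixed codebook; the names `compress`, `maybe_similar`, `codebook`, and `threshold` are hypothetical and chosen only for illustration. The signature stores the index of the nearest codeword and the distance to it, and the query test relies on the triangle inequality.

```python
import random


def hamming(a, b):
    """Number of positions where two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def compress(x, codebook):
    """Toy signature (an assumption, not the paper's scheme):
    the index of the nearest codeword and the distance to it.
    The source x cannot, in general, be recovered from this pair."""
    idx = min(range(len(codebook)), key=lambda i: hamming(x, codebook[i]))
    return idx, hamming(x, codebook[idx])


def maybe_similar(signature, y, codebook, threshold):
    """Answer 'maybe' (True) or 'no' (False) for the query y.

    If hamming(x, y) <= threshold, the triangle inequality gives
    hamming(c, y) <= hamming(c, x) + hamming(x, y) <= radius + threshold,
    so a similar pair is never rejected (no false negatives).
    A True answer may still be a false positive."""
    idx, radius = signature
    return hamming(codebook[idx], y) <= radius + threshold


# Illustrative parameters (hypothetical values).
rng = random.Random(0)
n, threshold = 16, 3
codebook = [tuple(rng.randint(0, 1) for _ in range(n)) for _ in range(8)]

x = tuple(rng.randint(0, 1) for _ in range(n))
signature = compress(x, codebook)

# A query built by flipping at most `threshold` bits of x is similar to x.
y = list(x)
for pos in rng.sample(range(n), threshold):
    y[pos] ^= 1
print(maybe_similar(signature, tuple(y), codebook, threshold))
```

In this sketch, errors can only be false positives: a query far from the source may still fall inside the enlarged ball around the codeword. The abstract's identification rate and identification exponent describe, respectively, how few bits a reliable scheme needs and how fast the false-positive probability can decay with the sequence length; the toy codebook above makes no attempt to achieve either.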
