Abstract

The mathematical theory behind coding and compression began a little more than 50 years ago with the publication of Claude Shannon's (1948) "A Mathematical Theory of Communication" in the Bell System Technical Journal. That article laid the foundation for what is now known as information theory, in a mathematical framework that is probabilistic (see, e.g., Cover and Thomas 1991; Verdú 1998); that is, Shannon modeled the signal or message process by a random process and a communication channel by a random transition matrix that may distort the message. In the five decades that followed, information theory provided fundamental limits for communication in general and for coding and compression in particular. These limits, predicted by information theory under probabilistic models, are now being approached in real products such as computer modems. Because these fundamental communication quantities, such as entropy and channel capacity, vary from signal process to signal process and from channel to channel, they must be estimated for each communication setup. In this sense, information theory is intrinsically statistical. Moreover, the algorithmic theory of information has inspired an extension of Shannon's ideas that provides a formal measure of information of the kind long sought in statistical inference and modeling. This measure has led to the minimum description length (MDL) principle for modeling in general and model selection in particular (Barron, Rissanen, and Yu 1998; Hansen and Yu 1998; Rissanen 1978, 1989).

A coding or compression algorithm is at work whenever one surfs the web, listens to a CD, uses a cellular phone, or works on a computer. In particular, when a music file is downloaded over the internet, a losslessly compressed file (often much smaller than the original) is transmitted instead of the original file. Lossless compression works because the music signal is statistically redundant, and this redundancy can be removed through statistical prediction. For digital signals, integer prediction can easily be carried out from past samples that are available to both the sender and the receiver, so only the residuals from the prediction need to be transmitted. These residuals can be coded at a much lower rate than the original signal (see, e.g., Edler, Huang, Schuller, and Yu 2000).
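The following is a minimal sketch of the predictive idea described in the last sentences, not the coding scheme of Edler et al. (2000): the sinusoid-plus-noise "music-like" signal and the one-step predictor (predict each sample by the previous one) are hypothetical choices for illustration. It shows that the prediction residuals have a much lower empirical entropy than the original samples, which is why they can be coded at a lower rate, and that the receiver can reconstruct the signal exactly.

```python
import numpy as np

def empirical_entropy(x):
    """Empirical entropy (bits/sample) of an integer-valued sequence."""
    _, counts = np.unique(x, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

# Hypothetical digital signal: a slowly varying waveform quantized to
# integers, so successive samples are highly correlated (redundant).
rng = np.random.default_rng(0)
t = np.arange(20000)
signal = np.round(100 * np.sin(2 * np.pi * t / 500)
                  + rng.normal(0, 2, t.size)).astype(int)

# Integer prediction from the past (known to both sender and receiver):
# predict each sample by the previous one and keep only the residual.
prediction = np.concatenate(([0], signal[:-1]))
residual = signal - prediction

print(f"entropy of original signal: {empirical_entropy(signal):.2f} bits/sample")
print(f"entropy of residuals      : {empirical_entropy(residual):.2f} bits/sample")

# The receiver adds each residual back to its own prediction, so the
# reconstruction is exact and the scheme is lossless.
reconstructed = np.cumsum(residual)
assert np.array_equal(reconstructed, signal)
```

Under these assumptions the residuals concentrate near zero while the raw samples range over a few hundred values, so an entropy coder applied to the residuals needs far fewer bits per sample; more elaborate predictors remove correspondingly more of the statistical redundancy.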
