Abstract

“A Mathematical Theory of Communication” was published in 1948 by Claude Shannon to address problems in data compression and communication over noisy channels. Since then, the concepts and ideas developed in Shannon’s work have formed the basis of information theory, a cornerstone of statistical learning and inference, and have played a key role in disciplines such as physics and thermodynamics, probability and statistics, computational sciences and biological sciences. In this article we review the basic concepts of information theory and describe their key applications in several major areas of research in computational biology: gene expression and transcriptomics, alignment-free sequence comparison, sequencing and error correction, genome-wide disease-gene association mapping, metabolic networks and metabolomics, and protein sequence, structure and interaction analysis.

Highlights

  • Information theory has its roots in communication systems that deal with data compression and coding theorems for transmission of information from one source to another over noisy channels. Claude Shannon’s seminal work in communication theory from 1948 [1] provided the mathematical foundations for quantification and representation of information that made today’s digital era possible. It introduced the concept of channel capacity, which defines the amount of information that can be sent over a noisy channel and is bounded by the maximum possible transmission rate (“Shannon’s Limit”). It also stated that information can be transmitted through a noisy channel at any rate below the channel capacity while keeping the probability of error at the receiver’s end arbitrarily small [2].

  • We aim for the following: (1) we revisit some topics from previous reviews, covering key newer entropy- and information-theory-based approaches in those areas; (2) we discuss information-theory-based measures of multivariate gene–gene interactions and survey articles using them in genome-wide disease-gene association mapping; (3) we offer a broad summary of information-theory-based applications by discussing several key and recent uses of information theory in computational biology topics that have not been collectively reviewed before.

  • Since Shannon’s seminal work over seventy years ago, information theory, with its foundations in statistical mechanics and communication theory, has had a tremendous impact on computational biology.


Summary

Introduction

Entropy 2020, 22, 627

Information theory has its roots in communication systems that deal with data compression and coding theorems for transmission of information from one source to another over noisy channels. Over the years, it has seen many theoretical advancements and witnessed myriad applications dealing with biological data. This is true in the umbrella field of computational biology and bioinformatics, which deals with computational applications of mathematical and statistical methods in the study of biological systems and processes. In this domain, information theory is widely used for model development and data analysis across a variety of biologically derived data types, ranging from molecular, sequence and phenotypic data in genomics and genetics to gene expression, protein and spectral data in transcriptomics, proteomics and metabolomics, respectively [4,5,6,7,8,9,10,11]. Because the subject is mathematical in nature, the first part of the article provides a uniform vocabulary and set of mathematical symbols for explaining the applications discussed in the second part.
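As a rough illustration of the vocabulary the first part introduces, the sketch below estimates Shannon entropy and mutual information from symbol frequencies. It is a minimal example under stated assumptions, not the article's own method: the function names `entropy` and `mutual_information` and the toy nucleotide strings are hypothetical, and the estimates are simple plug-in (empirical frequency) estimates using the standard identity I(X;Y) = H(X) + H(Y) - H(X,Y).

```python
import math
from collections import Counter


def entropy(symbols):
    """Plug-in estimate of Shannon entropy H(X), in bits, from observed symbols."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) = H(X) + H(Y) - H(X,Y), in bits."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    # The joint variable is the sequence of paired observations (x_i, y_i).
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


# Hypothetical toy data: two aligned nucleotide strings.
seq_a = "ACGTACGTACGT"
seq_b = "ACGTACGTACGT"

print(entropy(seq_a))                    # 2.0 bits: four equally frequent symbols
print(mutual_information(seq_a, seq_b))  # 2.0 bits: identical sequences, so I(X;X) = H(X)
```

Plug-in estimates of this kind are biased for short sequences, so real analyses typically need a bias correction or a larger sample; that choice is outside the scope of this sketch.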

Self-Information and Entropy
Conditional Entropy
Relative Entropy
Mutual Information
Interaction Information
Gene Expression and Transcriptomics
Alignment-Free Sequence Comparison
Sequencing and Error Correction
Genome-Wide Disease-Gene Association Mapping
Metabolic Networks and Metabolomics
Optimization in Biology
Dimensionality Reduction for Omics Analysis
Discussion