Abstract
The use of propositional logic in the analysis of gene expression microarray data enables the interrelations of gene expression variables to be used in answering biological or clinical questions. The main objection to this approach has been that the range of expression values for a given gene is artificially reduced to the extremes of high and low expression, losing important analytical detail. Propositional calculus demands such bipolarity; nevertheless, propositional functions are best evaluated by means of 2 × 2 contingency tables, and it is precisely in this evaluation that the analytical detail can be recovered by applying the principles of information theory. I introduce a method for evaluating 2 × 2 contingency tables that, instead of counting the number of cases, as is regularly done, measures and adds the amount of information for the cases. In this way each gene expression datum has two components: (1) a bipolar value (high or low expression) and (2) the amount of information (the decrease in uncertainty provided by the message, that is, the change in entropy). The first component is used to build propositional functions, and the second is used to evaluate the predictive power of these functions. The measurement of this second component is based on the principle that the amount of information for a variable with a particular value is inversely related to the probability of obtaining that value. Application of this method to empirical data reveals relations that had been lost in the process of bipolar data conversion.
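The two-component idea can be illustrated with a minimal Python sketch. It is not the procedure used in the paper, whose details the abstract does not give; several choices here are assumptions made only for illustration: the median as the high/low threshold, an equal-width histogram as the probability estimate, and the surprisal -log2(p) as the amount of information. The function names `bipolar_and_information` and `weighted_table` are hypothetical.

```python
import math
from collections import Counter


def bipolar_and_information(values, n_bins=4):
    """Split each expression value into (is_high, info_bits).

    Assumptions (not taken from the paper):
    - is_high means "strictly above the median";
    - probability is the relative frequency of the value's equal-width bin;
    - information is the surprisal -log2(p), so rarer values carry more bits.
    """
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2:
        median = ordered[mid]
    else:
        median = (ordered[mid - 1] + ordered[mid]) / 2
    lo, hi = ordered[0], ordered[-1]
    width = (hi - lo) / n_bins or 1.0  # avoid zero width for constant data
    bins = [min(int((v - lo) / width), n_bins - 1) for v in values]
    counts = Counter(bins)
    n = len(values)
    return [(v > median, -math.log2(counts[b] / n)) for v, b in zip(values, bins)]


def weighted_table(predictor, outcome):
    """Build a 2 x 2 table keyed by (predictor_high, outcome_true).

    Each cell holds the summed information of its cases rather than
    the number of cases.
    """
    table = {(p, o): 0.0 for p in (True, False) for o in (True, False)}
    for (is_high, info), o in zip(predictor, outcome):
        table[(is_high, o)] += info
    return table


# Toy data: six samples with low expression, two with rare high expression.
expression = [0, 0, 0, 0, 0, 0, 10, 10]
has_condition = [False] * 6 + [True, True]
table = weighted_table(bipolar_and_information(expression), has_condition)
```

In this toy example the two rare high-expression cases contribute more information per case than the six common low-expression cases, so the cells of the table are weighted by how improbable each observation is rather than by a plain count.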