Abstract

A common practice in microarray analysis is to apply a logarithmic transformation to the raw data (light intensity), on the grounds that it makes the distribution more symmetric and Gaussian-like. Since this transformation is not applied universally in microarray analyses, we examined whether this difference in the treatment of raw data affects the results of "high level" analysis. In particular, we asked whether the rank order of differentially expressed genes, as obtained by the t-test, the regularized t-test, or logistic regression, changes depending on the presence or absence of the transformation. We show that as many as 20%-40% of significant genes are "discordant" (significant in only one form of the data and not in both), depending on the test used and the threshold value for claiming significance. The t-test is more likely to be affected by the logarithmic transformation than logistic regression, and the regularized t-test is more affected than the t-test. On the other hand, the very top-ranking genes (e.g., up to the top 20-50 genes, depending on the test) are not affected by the logarithmic transformation.
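The notion of a "discordant" gene can be illustrated with a minimal sketch. It is not the authors' code, and it rests on several assumptions not stated in the abstract: a pooled-variance two-sample t statistic, "significant" taken to mean the top k genes ranked by |t|, the discordant fraction computed relative to the union of the two gene lists, and synthetic log-normal intensities in place of real microarray data. All function names are hypothetical.

```python
import math
import random
import statistics


def t_statistic(a, b):
    """Two-sample t statistic with pooled variance (an assumed variant)."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)
    pooled = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    diff = statistics.mean(a) - statistics.mean(b)
    return diff / math.sqrt(pooled * (1 / na + 1 / nb))


def top_genes(data, labels, k, transform=None):
    """Return the indices of the k genes with the largest |t|.

    data is a list of rows (one row per gene, one value per sample);
    labels holds 0 or 1 for each sample; transform is applied to every
    value if given (for example math.log).
    """
    scores = []
    for g, row in enumerate(data):
        values = [transform(x) for x in row] if transform else list(row)
        a = [v for v, lab in zip(values, labels) if lab == 0]
        b = [v for v, lab in zip(values, labels) if lab == 1]
        scores.append((abs(t_statistic(a, b)), g))
    scores.sort(reverse=True)
    return {g for _, g in scores[:k]}


def discordant_fraction(raw_top, log_top):
    """Share of genes selected in exactly one of the two forms of the data.

    The denominator (the union of both lists) is an assumption.
    """
    union = raw_top | log_top
    return len(raw_top ^ log_top) / len(union)


# Synthetic example: 200 genes, 10 samples per class, log-normal intensities.
# The first 20 genes are shifted on the log scale in class 1.
random.seed(0)
labels = [0] * 10 + [1] * 10
data = []
for g in range(200):
    shift = 1.0 if g < 20 else 0.0
    data.append([random.lognormvariate(6.0 + shift * lab, 1.0) for lab in labels])

raw_top = top_genes(data, labels, k=30)
log_top = top_genes(data, labels, k=30, transform=math.log)
print(discordant_fraction(raw_top, log_top))
```

The base of the logarithm does not matter here, because changing the base only rescales the values and the t statistic is invariant to rescaling. Varying `k` in this sketch plays the role of the significance threshold mentioned above.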
