Abstract

The expression microarray is a frequently used approach to study gene expression on a genome-wide scale. However, the data produced by the thousands of microarray studies published annually are confounded by “batch effects,” the systematic error introduced when samples are processed in multiple batches. Although batch effects can be reduced by careful experimental design, they cannot be eliminated unless the whole study is done in a single batch. A number of programs are now available to adjust microarray data for batch effects prior to analysis. We systematically evaluated six of these programs using multiple measures of precision, accuracy and overall performance. ComBat, an Empirical Bayes method, outperformed the other five programs by most metrics. We also showed that it is essential to standardize expression data at the probe level when testing for correlation of expression profiles, due to a sizeable probe effect in microarray data that can inflate the correlation among replicates and unrelated samples.
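The probe-effect point can be illustrated with a small simulation. This is a minimal sketch on synthetic data, not the authors' procedure: the array sizes, the noise levels, and the helper name `standardize_probes` are all assumptions chosen for illustration. Each probe gets its own baseline intensity while the samples are otherwise unrelated, and the correlation between two samples is compared before and after standardizing each probe across samples.

```python
import numpy as np

def standardize_probes(expr):
    """Center and scale each probe (row) across samples (columns).

    Hypothetical helper for illustration; expr is a probes x samples array.
    """
    mean = expr.mean(axis=1, keepdims=True)
    sd = expr.std(axis=1, ddof=1, keepdims=True)
    sd[sd == 0] = 1.0  # leave constant probes at zero instead of dividing by zero
    return (expr - mean) / sd

# Synthetic data (assumed values): every probe has its own baseline intensity,
# and the samples share nothing except those probe baselines.
rng = np.random.default_rng(0)
n_probes, n_samples = 2000, 12
probe_effect = rng.normal(8.0, 2.0, size=(n_probes, 1))
noise = rng.normal(0.0, 0.3, size=(n_probes, n_samples))
expr = probe_effect + noise

# Correlation between two unrelated samples, computed across probes.
raw_r = np.corrcoef(expr[:, 0], expr[:, 1])[0, 1]

# The same correlation after probe-level standardization.
z = standardize_probes(expr)
std_r = np.corrcoef(z[:, 0], z[:, 1])[0, 1]

print(f"raw correlation:          {raw_r:.3f}")
print(f"standardized correlation: {std_r:.3f}")
```

In this setup the raw correlation is high only because both samples inherit the same probe baselines; once each probe is standardized, the correlation between these unrelated samples falls close to zero.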

Highlights

  • The principal components (PCs) identified in the principal component analysis (PCA) that together account for a predetermined proportion of variation, here 60%, are retained for the variance component analysis (VCA)

  • The variation in each PC is weighted by its eigenvalue from PCA, and the resulting value represents the overall variation explained by that component
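The PC selection and eigenvalue weighting described in these highlights can be sketched as follows. This is an illustrative outline under stated assumptions, not the authors' implementation: the function names `retained_pcs` and `weighted_proportions` are hypothetical, the PCA is taken on the sample-by-sample covariance of probe-centered data, and the per-PC variance proportions (which would come from the variance component analysis) are supplied as made-up numbers rather than estimated.

```python
import numpy as np

def retained_pcs(expr, threshold=0.60):
    """Return the fraction of total variance explained by each retained PC.

    expr is a probes x samples array. PCs are kept, largest first, until
    their cumulative share of variance reaches `threshold`.
    """
    centered = expr - expr.mean(axis=1, keepdims=True)   # center each probe
    cov = np.cov(centered, rowvar=False)                 # samples x samples
    eigvals = np.linalg.eigh(cov)[0][::-1]               # descending order
    eigvals = np.clip(eigvals, 0.0, None)                # guard tiny negatives
    explained = eigvals / eigvals.sum()
    cumulative = np.cumsum(explained)
    k = int(np.searchsorted(cumulative, threshold)) + 1  # smallest k reaching threshold
    return explained[:k]

def weighted_proportions(per_pc_props, explained):
    """Average per-PC variance proportions, weighting each PC by its eigenvalue share.

    per_pc_props is a k x n_factors array; row i holds the proportion of
    variance in PC i attributed to each factor (assumed to be given).
    """
    weights = explained / explained.sum()
    return weights @ per_pc_props

# Synthetic example (assumed values): 500 probes, 8 samples in two batches.
rng = np.random.default_rng(1)
batch = np.array([0, 0, 0, 0, 1, 1, 1, 1])
shift = rng.normal(0.0, 2.0, size=(500, 1))              # probe-specific batch shift
expr = rng.normal(0.0, 1.0, size=(500, 8)) + shift * batch

explained = retained_pcs(expr, threshold=0.60)
print("PCs retained:", explained.size)
print("cumulative variance retained:", round(float(explained.sum()), 3))

# Made-up per-PC proportions for two factors (e.g. batch, residual).
toy_props = np.array([[0.5, 0.5],
                      [1.0, 0.0]])
toy_explained = np.array([0.45, 0.15])
print("weighted proportions:", weighted_proportions(toy_props, toy_explained))
```

Weighting by eigenvalue share means a factor that dominates a high-variance PC contributes more to the overall estimate than one that dominates a minor PC.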


Summary

Introduction

Gene expression microarray technology [1,2,3,4] measures the expression of thousands of genes in a single assay, using multiple probes to assay each transcript. It is a revolutionary tool for identifying genes or pathways whose expression changes in response to specific perturbations. Promising as it is, there are concerns regarding the reliability and utility of DNA microarray technology in the study of physiological processes and diseases [5,6]. The term “batch” refers to microarrays processed at one site over a short period of time using the same platform. The cumulative error introduced by these time- and place-dependent experimental variations is referred to as “batch effects.”
