Abstract

Background: Statistical methods for ranking differentially expressed genes (DEGs) from gene expression data should be evaluated with regard to high sensitivity, specificity, and reproducibility. In our previous studies, we evaluated eight gene ranking methods applied to only Affymetrix GeneChip data. A more general evaluation that also includes other microarray platforms, such as the Agilent or Illumina systems, is desirable for determining which methods are suitable for each platform and which method has better inter-platform reproducibility.

Results: We compared the eight gene ranking methods using the MicroArray Quality Control (MAQC) datasets produced by five manufacturers: Affymetrix, Applied Biosystems, Agilent, GE Healthcare, and Illumina. The area under the curve (AUC) was used as a measure for both sensitivity and specificity. Although the highest AUC values can vary with the definition of "true" DEGs, the best methods were, in most cases, either the weighted average difference (WAD), rank products (RP), or intensity-based moderated t statistic (ibmT). The percentages of overlapping genes (POGs) across different test sites were mainly evaluated as a measure for both intra- and inter-platform reproducibility. The POG values for WAD were the highest overall, irrespective of the choice of microarray platform. The high intra- and inter-platform reproducibility of WAD was also observed at a higher biological function level.

Conclusion: These results for the five microarray platforms were consistent with our previous ones based on 36 real experimental datasets measured using the Affymetrix platform. Thus, recommendations made using the MAQC benchmark data might be universally applicable.

Highlights

  • We recently reported that, with regard to inter-site reproducibility, weighted average difference (WAD) outperformed average difference (AD), the method recommended by the MicroArray Quality Control (MAQC) study.

  • We reported that the use of WAD or rank products (RP), in conjunction with suitable preprocessing algorithms dedicated to Affymetrix (AFX) GeneChip data, can increase both the sensitivity and the specificity of the results [14].

Summary

Introduction

Identification of differentially expressed genes (DEGs) under different conditions is an important goal in microarray-based gene expression analysis. For this identification, new gene ranking methods have been developed and comparative studies have been performed [1,2,3,4,5,6,7,8]. The MAQC study provides a large number of benchmark datasets measured using different microarray platforms and at different test sites for a set of common samples (so-called "Samples A-D"; for details, see Materials and Methods). This enables us to evaluate inter-site (or intra-platform) and inter-platform reproducibility. Evaluating those methods is important for creating up-to-date guidelines.
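
The inter-site and inter-platform reproducibility discussed here is summarized in the abstract by the percentage of overlapping genes (POG). As a minimal sketch, assuming POG is taken as the share of genes common to the top-N lists from two test sites (the function name `pog`, the gene identifiers, and the example rankings are hypothetical and not taken from the study), the calculation might look like this:

```python
def pog(ranked_a, ranked_b, top_n):
    """Percentage of overlapping genes between two ranked gene lists.

    Assumption: POG = 100 * |top-N of A intersect top-N of B| / N.
    This is an illustrative definition, not necessarily the exact
    one used in the study.
    """
    if top_n <= 0:
        raise ValueError("top_n must be a positive integer")
    if top_n > min(len(ranked_a), len(ranked_b)):
        raise ValueError("top_n exceeds the length of a ranked list")
    top_a = set(ranked_a[:top_n])
    top_b = set(ranked_b[:top_n])
    return 100.0 * len(top_a & top_b) / top_n


# Hypothetical rankings of the same five genes at two test sites,
# ordered from most to least differentially expressed.
site1 = ["g1", "g2", "g3", "g4", "g5"]
site2 = ["g2", "g1", "g5", "g3", "g4"]

# Two of the top three genes (g1 and g2) are shared between the sites.
overlap_top3 = pog(site1, site2, 3)
```

A higher POG at a given N means that the two sites (or platforms) agree more closely on which genes rank near the top, which is the sense in which a ranking method can be called more reproducible.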
