Abstract

BackgroundSerial Analysis of Gene Expression (SAGE) and microarrays have found awidespread application, but much ambiguity exists regarding the evaluation of these technologies. Cross-platform utilization of gene expression data from the SAGE and microarray technology could reduce the need for duplicate experiments and facilitate a more extensive exchange of data within the research community. This requires a measure for the correspondence of the different gene expression platforms. To date, a number of cross-platform evaluations (including a few studies using SAGE and Affymetrix GeneChips) have been conducted showing a variable, but overall low, concordance. This study evaluates these overall measures and introduces the between-ratio difference as a concordance measure pergene.ResultsIn this study, gene expression measurements of Unigene clusters represented by both Affymetrix GeneChips HG-U133A and SAGE were compared using two independent RNA samples. After matching of the data sets the final comparison contains a small data set of 1094 unique Unigene clusters, which is unbiased with respect to expression level. Different overall correlation approaches, like Up/Down classification, contingency tables and correlation coefficients were used to compare both platforms. In addition, we introduce a novel approach to compare two platforms based on the calculation of differences between expression ratios observed in each platform for each individual transcript. This approach results in a concordance measure per gene (with statistical probability value), as opposed to the commonly used overall concordance measures between platforms.ConclusionWe can conclude that intra-platform correlations are generally good, but that overall agreement between the two platforms is modest. This might be due to the binomially distributed sampling variation in SAGE tag counts, SAGE annotation errors and the intensity variation between probe sets of a single gene in Affymetrix GeneChips. We cannot identify or advice which platform performs better since both have their (dis)-advantages. Therefore it is strongly recommended to perform follow-up studies of interesting genes using additional techniques. The newly introduced between-ratio difference is a filtering-independent measure for between-platform concordance. Moreover, the between-ratio difference per gene can be used to detect transcripts with similar regulation on both platforms.

Highlights

  • Serial Analysis of Gene Expression (SAGE) and microarrays have found awidespread application, but much ambiguity exists regarding the evaluation of these technologies

  • The annotation of the short 10 bp sequence tags may identify more than one transcript. This can be overcome by using LongSAGE libraries that contain 17 bp tags which can be more reliably mapped to Unigene clusters or the complete genome sequence [16]

  • SAGE tags were converted into Unigene clusters using data from the CGAP website

Read more

Summary

Introduction

Serial Analysis of Gene Expression (SAGE) and microarrays have found awidespread application, but much ambiguity exists regarding the evaluation of these technologies. Cross-platform utilization of gene expression data from the SAGE and microarray technology could reduce the need for duplicate experiments and facilitate a more extensive exchange of data within the research community. Fruitful and still in use, have a limited scope with regard to the number of genes that can be analyzed simultaneously Because of this limitation, new methods have been developed, including serial analysis of gene expression (SAGE) [7], massive parallel signature sequencing (MPSS) [8], cDNA and oligo microarray chip technologies [9,10,11,12,13] and Affymetrix GeneChips [11]. SAGE is not suitable for highthroughput analyses of multiple samples

Methods
Results
Discussion
Conclusion
Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call