Abstract

Background: Transcriptome sequencing using next-generation sequencing platforms will soon be competing with DNA microarray technologies for global gene expression analysis. As a preliminary evaluation of these promising technologies, we performed deep sequencing of cDNA synthesized from the Microarray Quality Control (MAQC) reference RNA samples using Roche's 454 Genome Sequencer FLX.

Results: We generated more than 3.6 million sequence reads of average length 250 bp for the MAQC A and B samples and introduced a data analysis pipeline for translating cDNA read counts into gene expression levels. Using BLAST, 90% of the reads mapped to the human genome and 64% of the reads mapped to the RefSeq database of well-annotated genes with e-values ≤ 10⁻²⁰. We measured gene expression levels in the A and B samples by counting the numbers of reads that mapped to individual RefSeq genes in multiple sequencing runs. We used these counts to evaluate the MAQC quality metrics for reproducibility, sensitivity, specificity, and accuracy, and compared the results with DNA microarrays and Quantitative RT-PCR (QRTPCR) from the MAQC studies. In addition, 88% of the reads were successfully aligned directly to the human genome using the AceView alignment programs with an average 90% sequence similarity, identifying 137,899 unique exon junctions, including 22,193 new exon junctions not yet contained in the RefSeq database.

Conclusion: Using the MAQC metrics for evaluating the performance of gene expression platforms, the ExpressSeq results for gene expression levels showed excellent reproducibility, sensitivity, and specificity that improved systematically with increasing shotgun sequencing depth, and quantitative accuracy that was comparable to DNA microarrays and QRTPCR. In addition, a careful mapping of the reads to the genome using the AceView alignment programs shed new light on the complexity of the human transcriptome, including the discovery of thousands of new splice variants.

Highlights

  • Transcriptome sequencing using next-generation sequencing platforms will soon be competing with DNA microarray technologies for global gene expression analysis

  • Using our ExpressSeq pipeline, over 90% of the reads mapped to the human genome and 64% of the reads mapped to the RefSeq database of well-annotated genes using BLAST with e-values ≤ 10⁻²⁰, corresponding to at least 50 perfect match bases. (Because of the long read lengths, more stringent e-values, e.g. 10⁻⁵⁰ or 10⁻¹⁰⁰, could be used for the BLAST searches, but these would begin to miss reads that partially hit exons that are not already contained in RefSeq.) By counting the numbers of reads that map to individual genes, we can measure the gene expression levels in the two samples to evaluate the Microarray Quality Control (MAQC) quality metrics and compare the results with DNA microarrays and Quantitative RealTime PCR (QRTPCR)

  • The reads were mapped to the human genome and to the RefSeq database of well-annotated RNA sequences using the ExpressSeq pipeline implemented on a Windows Desktop
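The read-counting idea in the highlights above can be illustrated with a minimal Python sketch. It is not the ExpressSeq pipeline itself: the input format (tuples of read ID, gene ID, and e-value), the best-hit-per-read rule, the reads-per-million scaling, and the pseudocount are all assumptions made for illustration. Only the e-value cutoff of 10⁻²⁰ comes from the text.

```python
import math
from collections import Counter

# E-value cutoff quoted in the text; everything else here is illustrative.
E_VALUE_CUTOFF = 1e-20


def count_reads_per_gene(hits, cutoff=E_VALUE_CUTOFF):
    """Count reads per gene from BLAST-like hits.

    `hits` is an iterable of (read_id, gene_id, e_value) tuples, a
    hypothetical simplified format. Hits weaker than the cutoff are dropped,
    and each read is assigned to its single best (lowest e-value) hit,
    which is an assumption rather than a rule stated in the source.
    """
    best = {}
    for read_id, gene_id, e_value in hits:
        if e_value > cutoff:
            continue  # hit is too weak to count
        if read_id not in best or e_value < best[read_id][1]:
            best[read_id] = (gene_id, e_value)
    return Counter(gene_id for gene_id, _ in best.values())


def log2_ratio(counts_a, counts_b, total_a, total_b, pseudocount=0.5):
    """Return log2(B/A) per gene after scaling counts to reads per million.

    The pseudocount (an assumption) avoids division by zero when a gene
    has no reads in one of the two samples.
    """
    ratios = {}
    for gene in set(counts_a) | set(counts_b):
        rpm_a = (counts_a.get(gene, 0) + pseudocount) * 1e6 / total_a
        rpm_b = (counts_b.get(gene, 0) + pseudocount) * 1e6 / total_b
        ratios[gene] = math.log2(rpm_b / rpm_a)
    return ratios
```

Tightening `cutoff` to 1e-50 or 1e-100 in this sketch would mirror the trade-off noted above: fewer ambiguous hits, but more reads lost when they only partially overlap annotated exons.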


Summary

Introduction

Transcriptome sequencing using next-generation sequencing platforms will soon be competing with DNA microarray technologies for global gene expression analysis. The MAQC study provided a set of reference RNA samples with large numbers of differentially expressed genes, consisting of the commercially available A sample from pooled human cell lines and the B sample from a pooled human brain preparation. These two samples were exhaustively analyzed on a number of different whole-genome microarray platforms and by Quantitative RealTime PCR (QRTPCR) [1,2]. In addition, thousands of new exon junctions were identified spanning standard introns in the genome that have not been previously annotated in any public databases. These additional results speak to the great promise of deep transcriptome sequencing to rapidly shed new light on the complexity of the eukaryotic transcriptome.
