Abstract

Massively parallel, tag-based sequencing systems, such as the SOLiD system, hold the promise of revolutionizing the study of whole-genome gene expression because of the number of data points that can be generated in a simple and cost-effective manner. We describe the development of a 5′-end transcriptome workflow for the SOLiD system and demonstrate the advantages in sensitivity and dynamic range offered by this tag-based application over traditional approaches for the study of whole-genome gene expression. 5′-end transcriptome analysis was used to study whole-genome gene expression within a colon cancer cell line, HT-29, treated with the DNA methyltransferase inhibitor 5-aza-2′-deoxycytidine (5Aza). More than 20 million 25-base 5′-end tags were obtained from untreated and 5Aza-treated cells and matched to sequences within the human genome. Seventy-three percent of the mapped unique tags were associated with RefSeq cDNA sequences, corresponding to approximately 14,000 different protein-coding genes in this single cell type. The level of expression of these genes ranged from 0.02 to 4,704 transcripts per cell. The sensitivity of a single sequence run of the SOLiD platform was 100- to 1,000-fold greater than that observed from 5′-end SAGE data generated from the analysis of 70,000 tags obtained by Sanger sequencing. The high-resolution 5′-end gene expression profiling presented in this study will not only provide novel insight into the transcriptional machinery but should also serve as a basis for a better understanding of cell biology.

Highlights

  • Genome-wide analysis of gene expression in different cell subpopulations provides insights into many aspects of developmental biology and physiology

  • Development of 5′-end SOLiD (5′SOLiD) sequencing technology

  • The tag length used in 5′-end SAGE (5′SAGE) technology (19 bp) makes it difficult to identify precisely the genome position of some of the sequence tags

  • We developed a method to improve 5′SAGE by the generation of 27-mer sequence tags combined with massively parallel sequencing


Introduction

Genome-wide analysis of gene expression in different cell subpopulations provides insights into many aspects of developmental biology and physiology. Although established functional genomic technologies, such as DNA arrays and serial analysis of gene expression (SAGE), can identify coding and noncoding RNA transcripts, identification of genes across the whole genome is still problematic. Most unique transcripts are expressed at low levels [1,2], and fundamental cellular mechanisms cannot be identified from the limited number of genes analyzed per study. Expression profiling is usually carried out by hybridization to microarrays. This approach, while immensely useful, is not very quantitative, as it typically yields relative rather than absolute mRNA abundance, and the results are difficult to compare across different microarray platforms.
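
To illustrate what absolute abundance means for tag-based data, the following is a minimal sketch of how a raw tag count might be scaled to transcripts per cell. It is not the authors' procedure: the function name `transcripts_per_cell`, the default of 300,000 mRNA molecules per cell, the 15 million mapped tags, and the gene names and counts are all assumptions chosen for illustration.

```python
def transcripts_per_cell(tag_count, total_mapped_tags, mrna_per_cell=300_000):
    """Scale a raw tag count to an estimated number of transcripts per cell.

    mrna_per_cell is an assumed constant, not a value taken from the study.
    """
    if total_mapped_tags <= 0:
        raise ValueError("total_mapped_tags must be positive")
    # Fraction of the library occupied by this tag, times the assumed
    # number of mRNA molecules in one cell.
    return tag_count * mrna_per_cell / total_mapped_tags


# Hypothetical library of 15 million mapped tags (illustrative only).
total = 15_000_000
hypothetical_counts = {"gene_a": 1, "gene_b": 500, "gene_c": 200_000}

for gene, count in hypothetical_counts.items():
    estimate = transcripts_per_cell(count, total)
    print(f"{gene}: {estimate:.2f} transcripts per cell")
```

Under these assumed numbers, a single tag corresponds to 0.02 transcripts per cell, so the detection floor falls as the number of mapped tags grows. A hybridization signal, by contrast, has no comparable count-based denominator.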
