Abstract

Background: We applied the Virtual Northern technique to human brain mRNA to systematically measure human mRNA transcript lengths on a genome-wide scale.

Methodology/Principal Findings: We used separation by gel electrophoresis followed by hybridization to cDNA microarrays to measure 8,774 mRNA transcript lengths representing at least 6,238 genes at high (>90%) confidence. By comparing these transcript lengths to the Refseq and H-Invitational full-length cDNA databases, we found that nearly half of our measurements appeared to represent novel transcript variants. Comparison of length measurements determined by hybridization to different cDNAs derived from the same gene identified clones that potentially correspond to alternative transcript variants. We observed a close linear relationship between ORF and mRNA lengths in human mRNAs, identical in form to the relationship we had previously identified in yeast. Some functional classes of protein are encoded by mRNAs whose untranslated regions (UTRs) tend to be longer or shorter than average; these functional classes were similar in both human and yeast.

Conclusions/Significance: Human transcript diversity is extensive and largely unannotated. Our length dataset can be used as a new criterion for judging the completeness of cDNAs and annotating mRNA sequences. Similar relationships between the lengths of the UTRs in human and yeast mRNAs and the functions of the proteins they encode suggest that UTR sequences serve an important regulatory role among eukaryotes.

Highlights

  • Now that the human genome sequence is nearly complete [1,2,3], the next step is to characterize the organization, function, and diversity of the human genome

  • We separated human brain mRNA by length on an agarose gel, sliced the gel into 50 narrow sections each containing RNA from a small range of lengths, and hybridized the RNA from each slice to a separate cDNA microarray (Figure 1)

  • The data for each cDNA from all 50 microarrays were combined into a profile that peaks in the slice, or slices, that contain mRNAs complementary to a given cDNA sequence represented on the microarray (Figure 2)
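
The profile-building step described above can be illustrated with a minimal sketch. It assumes that each of the 50 microarrays is available as a plain list of signal values indexed by cDNA; the function names `build_profile` and `find_peak_slices`, the `min_fraction` threshold, and the local-maximum rule are all hypothetical choices made for illustration and are not the authors' actual analysis procedure.

```python
from typing import List, Sequence


def build_profile(slice_signals: Sequence[Sequence[float]], cdna_index: int) -> List[float]:
    """Collect one cDNA's signal from every slice's microarray into one profile.

    slice_signals[s][c] is the (hypothetical) signal for cDNA c on the
    microarray hybridized with RNA from gel slice s.
    """
    return [array[cdna_index] for array in slice_signals]


def find_peak_slices(profile: Sequence[float], min_fraction: float = 0.5) -> List[int]:
    """Return slice indices that are local maxima of the profile.

    A slice counts as a peak if its signal is at least min_fraction of the
    profile's maximum and is higher than its left neighbour and not lower
    than its right neighbour. The threshold is an illustrative assumption.
    """
    if not profile:
        return []
    top = max(profile)
    if top <= 0:
        return []
    peaks = []
    for i, value in enumerate(profile):
        left = profile[i - 1] if i > 0 else float("-inf")
        right = profile[i + 1] if i < len(profile) - 1 else float("-inf")
        if value >= min_fraction * top and value > left and value >= right:
            peaks.append(i)
    return peaks


# Toy data: 50 slices, 2 cDNAs (values are invented for the example).
slice_signals = [[0.0, 0.0] for _ in range(50)]
slice_signals[9][0], slice_signals[10][0], slice_signals[11][0] = 1.0, 4.0, 1.5
slice_signals[5][1], slice_signals[30][1] = 3.0, 2.0

single_peak = find_peak_slices(build_profile(slice_signals, 0))
double_peak = find_peak_slices(build_profile(slice_signals, 1))
```

In this toy example the first cDNA yields a single peak slice, while the second yields two, which is the kind of pattern that could correspond to transcript variants of different lengths. Converting a peak slice into a transcript length would additionally require a size calibration for the gel, which is not sketched here.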

Summary

Introduction

Now that the human genome sequence is nearly complete [1,2,3], the next step is to characterize the organization, function, and diversity of the human genome. Although full-length cDNA sequencing projects provide the basis for virtually all human gene identification and analysis, they suffer from several limitations. The small number of cDNA clones representing most genes makes estimates of the relative abundance of transcripts from tissue to tissue, and variant to variant, unreliable. Due to these limitations, it is unlikely that the goal of completely characterizing the human transcriptome, including all transcript variants across all tissues, disease states, and developmental stages, will be accomplished by full-length cDNA sequencing alone. We used separation by gel electrophoresis followed by hybridization to cDNA microarrays to measure 8,774 mRNA transcript lengths representing at least 6,238 genes at high (>90%) confidence. Similar relationships between the lengths of the UTRs in human and yeast mRNAs and the functions of the proteins they encode suggest that UTR sequences serve an important regulatory role among eukaryotes.
