Abstract

While the majority of multiexonic human genes show some evidence of alternative splicing, it is unclear what fraction of observed splice forms is functionally relevant. In this study, we examine the extent of alternative splicing in human cells using deep RNA sequencing and de novo identification of splice junctions. We demonstrate the existence of a large class of low abundance isoforms, encompassing approximately 150,000 previously unannotated splice junctions in our data. Newly-identified splice sites show little evidence of evolutionary conservation, suggesting that the majority are due to erroneous splice site choice. We show that sequence motifs involved in the recognition of exons are enriched in the vicinity of unconserved splice sites. We estimate that the average intron has a splicing error rate of approximately 0.7% and show that introns in highly expressed genes are spliced more accurately, likely due to their shorter length. These results implicate noisy splicing as an important property of genome evolution.
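To give a feel for what a per-intron error rate of about 0.7% implies, the following minimal sketch computes the fraction of transcripts in which every intron is spliced correctly. It assumes that splicing errors at different introns are independent, which is a simplification made here for illustration and is not stated in the abstract. The function name `fraction_error_free` and the intron counts in the loop are hypothetical.

```python
def fraction_error_free(n_introns: int, per_intron_error: float = 0.007) -> float:
    """Probability that all introns of a transcript are spliced correctly.

    Assumption (illustrative only): errors at different introns are
    independent and share the same rate. The default of 0.007 is the
    approximate per-intron estimate quoted in the abstract.
    """
    if n_introns < 0:
        raise ValueError("n_introns must be non-negative")
    if not 0.0 <= per_intron_error <= 1.0:
        raise ValueError("per_intron_error must lie in [0, 1]")
    return (1.0 - per_intron_error) ** n_introns


# Hypothetical intron counts, chosen only to show how the fraction decays.
for n in (1, 8, 20):
    print(f"{n:2d} introns: {fraction_error_free(n):.3f} error-free")
```

Under this simplified model, the error-free fraction falls geometrically with intron number, so even a small per-intron rate yields a noticeable pool of misspliced, low-abundance isoforms for genes with many introns.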

Highlights

  • Most mammalian mRNAs are processed from much longer precursors in a series of splicing reactions

  • We present the number of such junctions, the average number of reads spanning each junction in that class, the percentage of the junctions observed in any tissue assayed in Wang et al. [2], the percentage of 5′ and 3′ splice sites of each junction that fall near an annotated splice site ("near" is defined here as within 50 base pairs), and the percentage of the 5′ and 3′ splice sites of each junction that show strong evidence of evolutionary conservation. doi:10.1371/journal.pgen.1001236.t001

  • We have examined the extent of alternative splicing in human cells

Summary

Introduction

Most mammalian mRNAs are processed from much longer precursors in a series of splicing reactions. Regulation of these splicing reactions can lead to alternatively spliced forms of mRNA from the same pre-mRNA [1], and there is considerable interest in cataloguing the functionally important transcripts of all mammalian genes. Towards this end, transcript diversity has been examined using data from full mRNA sequences, expressed sequence tags (ESTs), or high-throughput sequencing of cDNA libraries (RNA-Seq) [2,3,4,5,6]. It is hypothesized that short introns in humans (as well as in other eukaryotes) have evolved to preferentially trigger degradation via nonsense-mediated decay (NMD) mechanisms when the spliceosome fails to remove them, suggesting that such errors are common enough to exert a detectable selective pressure [14].

