A Bioinformatics-Based Alternative mRNA Splicing Code that May Explain Some Disease Mutations Is Conserved in Animals.

Wen Qu, Barry R Zeeberg, Pablo Cingolani, Douglas M Ruden

doi:10.3389/fgene.2017.00038

Abstract

Deep sequencing of cDNAs made from spliced mRNAs indicates that most coding genes in many animals and plants have pre-mRNA transcripts that are alternatively spliced. In pre-mRNAs, in addition to invariant exons that are present in almost all mature mRNA products, there are at least 6 additional types of exons, such as exons from alternative promoters or with alternative polyA sites, mutually exclusive exons, skipped exons, or exons with alternative 5′ or 3′ splice sites. Our bioinformatics-based hypothesis is that, in analogy to the genetic code, there is an “alternative-splicing code” in introns and flanking exon sequences that directs alternative splicing of many of the 36 types of introns. In humans, we identified 42 different consensus sequences that are each present in at least 100 human introns. Of these 42 top consensus sequences, 37 are significantly enriched or depleted in at least one of the 36 types of introns. We further supported our hypothesis by showing that 96 out of 96 analyzed human disease mutations that affect RNA splicing, and change alternative splicing from one class to another, can be partially explained by a mutation altering a consensus sequence from one type of intron to that of another type of intron. Some of the alternative splicing consensus sequences, and presumably their small-RNA or protein targets, are evolutionarily conserved across 50 species, from plants to animals. We also noticed that the set of introns within a gene usually shares the same splicing codes, which argues that one sub-type of spliceosome might process all (or most) of the introns in a given gene. Our work sheds new light on a possible mechanism for generating the tremendous diversity in protein structure by alternative splicing of pre-mRNAs.

Highlights

The almost invariant consensus sequence for mRNA splicing in animals and plants is gu_ag, where gu is the splice donor sequence and ag is the splice acceptor sequence
The splice acceptor consensus sequence is preceded by two elements: a branch point sequence, whose adenine is ligated to the 5′ splice site ribonucleotide to form the intron lariat, and a polypyrimidine tract (c or u), which lies between the branch point and the splice acceptor sequence
To begin our bioinformatics analysis of introns, we first generated a table of paired splice donor and acceptor consensus sequences, from the most common to the least common

Summary

Introduction

The almost invariant consensus sequence for mRNA splicing in animals and plants is gu_ag, where gu is the splice donor sequence and ag is the splice acceptor sequence. The expression “gu_ag” means that only the two terminal nucleotides at the 5′ and 3′ ends of the sequence are invariant, as gu and ag, respectively, and that the sequence represented by the underscore can be any sequence. Here we use this expression to indicate that the sequence represented by the underscore can be any sequence except for sequences that do not match any of the other consensus sequences. The splice acceptor consensus sequence is preceded by two elements: a branch point sequence, whose adenine is ligated to the 5′ splice site ribonucleotide to form the intron lariat, and a polypyrimidine tract (c or u), which lies between the branch point and the splice acceptor sequence. The flanking one or two nucleotides on either side of the intron are often conserved, and they are included in our supplementary tables, but they will not be discussed further in this paper so that we can focus our analyses on consensus sequences at the ends of the introns.
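
The donor_acceptor notation can be illustrated with a short sketch that labels each intron by its terminal dinucleotides and tabulates the labels from most to least common. This is only an illustration under stated assumptions, not the authors' pipeline: the function names splice_pair and tabulate_pairs are hypothetical, the toy sequences are invented, and the sketch looks only at the two nucleotides at each end of the intron.

```python
from collections import Counter


def splice_pair(intron: str) -> str:
    """Return a donor_acceptor label such as 'gu_ag' for one intron sequence."""
    # Accept DNA or RNA input by writing everything as lowercase RNA.
    rna = intron.lower().replace("t", "u")
    if len(rna) < 4:
        raise ValueError("intron too short for separate donor and acceptor dinucleotides")
    # First two nucleotides = splice donor, last two = splice acceptor.
    return f"{rna[:2]}_{rna[-2:]}"


def tabulate_pairs(introns):
    """Count donor_acceptor labels and list them from most to least common."""
    return Counter(splice_pair(seq) for seq in introns).most_common()


# Hypothetical toy introns, invented purely for illustration.
toy_introns = [
    "GTAAGTCTAACTTTTTCCAG",
    "GUAUGUUCUAACUUUUUUAG",
    "GCAAGTCTGACTTTCTCTAG",
    "ATATCCTTTCCTTAACTTAC",
]

table = tabulate_pairs(toy_introns)
for label, count in table:
    print(label, count)
```

A real analysis would read intron coordinates from a genome annotation and would also have to match the longer consensus sequences described in the paper; the sketch deliberately stops at the terminal dinucleotides.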

Journal: Frontiers in Genetics
Publication Date: Apr 11, 2017
Citations: 14
License type: CC BY
