Abstract

Most protein-encoding genes in eukaryotes contain introns, which are interwoven with exons. Introns need to be removed from initial transcripts in order to generate the final messenger RNA (mRNA), which can be translated into an amino acid sequence. Precise excision of introns by the spliceosome requires conserved dinucleotides, which mark the splice sites. However, there are variations of the highly conserved combination of GT at the 5′ end and AG at the 3′ end of an intron in the genome. GC-AG and AT-AC are two major non-canonical splice site combinations, which have been known for years. Recently, various minor non-canonical splice site combinations were detected with numerous dinucleotide permutations. Here, we extend systematic investigations of non-canonical splice site combinations from plants to other eukaryotes by analyzing fungal and animal genome sequences. Comparisons of splice site combinations between these three kingdoms revealed several differences, such as an apparently increased CT-AC frequency in fungal genome sequences. Canonical GT-AG splice site combinations in antisense transcripts are a likely explanation for this observation, thus indicating annotation errors. In addition, high numbers of GA-AG splice site combinations were observed in Eurytemora affinis and Oikopleura dioica. A variant in one U1 small nuclear RNA (snRNA) isoform might allow the recognition of GA as a 5′ splice site. In-depth investigation of splice site usage based on RNA-Seq read mappings indicates a generally higher flexibility of the 3′ splice site compared to the 5′ splice site across animals, fungi, and plants.
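
The antisense explanation for CT-AC can be checked with a few lines of Python. The following is a minimal sketch, not the authors' pipeline: the helper names and the toy intron sequence are assumptions made for illustration. It shows that an intron with canonical GT-AG borders, when read from the opposite strand, appears to carry CT-AC.

```python
# Illustrative sketch only; function names and the toy sequence are hypothetical.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def splice_site_combination(intron: str) -> str:
    """Return the terminal dinucleotides of an intron as 'XX-YY'."""
    intron = intron.upper()
    return f"{intron[:2]}-{intron[-2:]}"


# Toy intron with canonical borders (GT at the 5' end, AG at the 3' end).
toy_intron = "GTAAGTTTCCTAACTTTTCAG"

sense = splice_site_combination(toy_intron)
antisense = splice_site_combination(reverse_complement(toy_intron))

print(sense)      # GT-AG
print(antisense)  # CT-AC
```

An intron annotated on the wrong strand would therefore be counted as CT-AC even though the spliceosome recognized GT-AG.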

Highlights

  • The removal of introns from initial transcripts is an essential step during the generation of mature messenger RNAs in eukaryotes

  • Our findings indicate that the minor non-canonical splice site combination CT-AC occurs with a significantly (Mann-Whitney U-Test; p ≈ 0.00035) higher frequency in the annotation of fungal genome sequences than the major non-canonical splice site combination AT-AC

  • We investigated non-canonical splice sites in transcripts of Armillaria gallica, as this species shows a high number of non-canonical splice sites in the annotation and the set of fungal genome sequences has a feasible size for this analysis

Introduction

The removal of introns from initial transcripts is an essential step during the generation of mature messenger RNAs (mRNAs) in eukaryotes. This process allows variation, which provides the basis for quick adaptation to changing conditions [1,2]. Introns are energetically expensive for the cell to maintain, as their transcription costs time and energy and their removal has to be precisely regulated [8]. Dinucleotides at both intron/exon borders mark the splice sites and are highly conserved [9]. GT at the 5′ end and AG at the 3′ end of an intron form the canonical splice site combination at the DNA level.
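
To make the terminology concrete, the sketch below sorts introns into the three groups used in this article: canonical (GT-AG), major non-canonical (GC-AG and AT-AC), and minor non-canonical (any other combination). It is an illustrative example under stated assumptions; the function name and the toy intron sequences are hypothetical and do not come from the analyzed annotations.

```python
from collections import Counter

# Category definitions follow the text: GT-AG is canonical,
# GC-AG and AT-AC are the major non-canonical combinations.
CANONICAL = {"GT-AG"}
MAJOR_NON_CANONICAL = {"GC-AG", "AT-AC"}


def classify_intron(intron: str) -> tuple[str, str]:
    """Return (combination, category) for an intron given as DNA sequence."""
    intron = intron.upper()
    if len(intron) < 4:
        raise ValueError("intron too short to hold two distinct dinucleotides")
    combo = f"{intron[:2]}-{intron[-2:]}"
    if combo in CANONICAL:
        return combo, "canonical"
    if combo in MAJOR_NON_CANONICAL:
        return combo, "major non-canonical"
    return combo, "minor non-canonical"


# Hypothetical toy introns, invented for this example.
toy_introns = [
    "GTAAGTACTAACTTTCAG",  # GT-AG
    "GCAAGTACTAACTTTCAG",  # GC-AG
    "ATATCCTTTTCCTTAAC",   # AT-AC
    "GAAAGTACTAACTTTCAG",  # GA-AG
]

category_counts = Counter(classify_intron(seq)[1] for seq in toy_introns)
for category, count in category_counts.items():
    print(category, count)
```

Applied to all annotated introns of a genome, a tally like this yields the per-species splice site combination frequencies that are compared between kingdoms.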
