Abstract
At the end of introns, the polypyrimidine tract (Py) is often close to the 3′ AG in a consensus (Y)20NCAGgt in humans. Interestingly, we have found that they can also be separated by purine-rich elements, including G tracts, in thousands of human genes. These regulatory elements between the Py and 3′ AG (REPA) mainly regulate alternative 3′ splice sites (3′ SS) and intron retention. Here we show their widespread distribution and special properties across kingdoms. By whole-genome analysis, purine-rich 3′ SS are found in up to about 60% of the introns among more than 1,000 species/lineages, and up to 18% of these introns contain REPA G-tracts (REPAGs), in about 0.6 million 3′ SS in total. In particular, they are significantly enriched over their 3′ SS and genome backgrounds in metazoa and plants, and are highly associated with alternative splicing of genes in diverse functional clusters. Cryptic splice sites harboring such G- and other purine-triplets tend to be enriched (2- to 9-fold over the disrupted canonical 3′ SS) and aberrantly used in cancer patients carrying mutations in SF3B1 or U2AF35, factors critical for branch point (BP) or 3′ AG recognition, respectively. Moreover, the REPAGs are significantly associated with reduced occurrences of BP motifs between the −24 and −4 positions; in particular, BP motifs are absent between the −7 and −5 positions in several model organisms examined. The more distant BPs are associated with increased occurrences of alternative splicing in humans and zebrafish. The REPAGs appear to have evolved in a species- or phylum-specific way. Thus, there is widespread separation of the Py and 3′ AG by REPAGs that have evolved differentially. This special 3′ SS arrangement likely contributes to the generation of diverse transcript or protein isoforms in biological functions or diseases through alternative or aberrant splicing.
Highlights
Splice sites demarcate the boundaries between introns and exons for proper splicing of precursor RNA transcripts
The majority of 3′ splice sites (3′ SS) are comprised of the branch point (BP), polypyrimidine tract and 3′ AG dinucleotides with a consensus (Y)20NCAGgt in humans based on the whole genome data (Nguyen et al, 2018)
Analysis of individual 3′ SS indicates that the human REPA G-tracts (REPAG) were ‘inserted’ between the polypyrimidine tract (Py) and 3′ AG mostly in the ancestors of mammalian genes during evolution (Sohail et al, 2014; Sohail and Xie, 2015b); their genome-wide prevalence and relationship to the upstream BP and alternative splicing among different species remain unclear. We examine their distribution in individual 3′ SS among > 1,000 Ensembl-annotated species/lineages, their association with alternative splicing, their diverse host genes including those with aberrant splice sites in cancer, and their association with distant BP motifs
Summary
Splice sites demarcate the boundaries between introns and exons for proper splicing of precursor RNA transcripts. Their sequences are constrained by a consensus but could be highly diverse among hundreds of species (Nguyen et al, 2018). The majority of 3′ splice sites (3′ SS) are comprised of the branch point (BP), polypyrimidine tract and 3′ AG dinucleotides with a consensus (Y)20NCAGgt in humans based on the whole genome data (Nguyen et al, 2018). These motifs are recognized by splicing complexes/factors U2 snRNP, U2AF65 and U2AF35, respectively (Black et al, 1985; Kramer, 1987; Singh et al, 1995; Merendino et al, 1999; Wu et al, 1999; Zorio and Blumenthal, 1999). Mutations of SF3B1 of the U2 snRNP complex and U2AF35 cause aberrant 3′ splice site usage in leukemia, melanoma, breast and lung cancers (Przychodzen et al, 2013; Brooks et al, 2014; Darman et al, 2015; Alsafadi et al, 2016; Kesarwani et al, 2017).
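To make the 3′ SS arrangement described above concrete, the following is a minimal illustrative sketch of how a G tract lying between the polypyrimidine tract and the 3′ AG might be flagged in the last nucleotides of an intron. It is not the authors' pipeline: the function name `find_repag` and the thresholds `min_g` (3 consecutive G) and `min_py` (6 consecutive pyrimidines) are assumptions chosen only for illustration.

```python
import re


def find_repag(intron_tail, min_g=3, min_py=6):
    """Flag a G tract between the last pyrimidine run and the 3' AG.

    intron_tail: the 3' end of an intron, ending with the AG dinucleotide.
    min_g, min_py: hypothetical thresholds, not taken from the paper.
    Returns (first, last) positions of the G tract relative to the intron
    end (the G of the 3' AG is position -1), or None if nothing is found.
    """
    seq = intron_tail.upper().replace("U", "T")
    if not seq.endswith("AG"):
        return None
    body = seq[:-2]  # everything upstream of the 3' AG
    # Locate pyrimidine (C/T) runs and keep the one closest to the 3' AG.
    py_runs = list(re.finditer(r"[CT]{%d,}" % min_py, body))
    if not py_runs:
        return None
    py_end = py_runs[-1].end()
    # Look for a G tract in the spacer between that run and the 3' AG.
    g_tract = re.search(r"G{%d,}" % min_g, body[py_end:])
    if g_tract is None:
        return None
    first = py_end + g_tract.start() - len(seq)
    last = py_end + g_tract.end() - 1 - len(seq)
    return (first, last)


# Toy sequences (invented, not from any genome annotation)
with_g_tract = "TTTTCTTTTCTTCCTTGGGAACAG"
canonical = "TTTTCTTTTCTTCCTTTTCTGCAG"
print(find_repag(with_g_tract))
print(find_repag(canonical))
```

In the first toy sequence the Py and the 3′ AG are separated by a GGG triplet, whereas the second follows the (Y)20NCAG pattern and yields no hit. A genome-wide analysis would additionally need annotated intron coordinates and background frequencies, which this sketch does not attempt.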