In the human genome, CAG 3' splice sites (3'ss) are more than twice as frequent as TAG 3'ss. The greater abundance of the former has been attributed to a higher probability of exon skipping upon cytosine-to-thymine transitions at intron position -3 (-3C > T) than upon thymine-to-cytosine transitions (-3T > C). However, the molecular mechanisms underlying this bias and its clinical impact are poorly understood. In this study, base-pairing probabilities (BPPs) and RNA secondary structures were compared between CAG 3'ss that produced more skipping of downstream exons than their mutated UAG versions (termed "laggard" CAG 3'ss) and UAG 3'ss that resulted in more skipping than their mutated CAG counterparts (termed canonical 3'ss). The laggard CAG 3'ss showed significantly higher BPPs across intron-exon boundaries than canonical 3'ss. The difference was centered on positions -5 to -1 relative to the intron-exon junction, the region previously shown to exhibit the strongest high-resolution ultraviolet crosslinking to the small subunit of the auxiliary factor of U2 snRNP (U2AF1). RNA secondary structure predictions suggested that laggard CAG 3'ss were more often sequestered in paired conformations and in longer stem structures, whereas canonical 3'ss were more frequently unpaired. Taken together, these results indicate that excess base-pairing at 3'ss has the potential to alter the hierarchy in intrinsic splicing efficiency of human YAG 3'ss from canonical CAG > UAG to non-canonical UAG > CAG, to modify the clinical impact of transitions at this position, and to change their classification from pathogenic to benign or vice versa.