R-loops and guanine quadruplexes (G4s) are secondary structures of nucleic acids that are ubiquitously present in cells and are enriched in promoter regions of genes. By employing a bioinformatic approach based on overlap analysis of transcription factor chromatin immunoprecipitation sequencing (ChIP-seq) data sets, we found that many splicing factors, including U2AF1, whose recognition of the 3' splice site is crucial for pre-mRNA splicing, exhibit pronounced enrichment at endogenous R-loop and DNA G4 structure loci in promoter regions of human genes. We also revealed that U2AF1 binds directly to R-loops and DNA G4 structures with low-nanomolar affinity. Additionally, we showed that U2AF1 can undergo phase separation, which is stimulated by binding to R-loops, but not by binding to duplex DNA, RNA/DNA hybrids, DNA G4s, or single-stranded RNA. We also demonstrated that U2AF1 binds to promoter R-loops in human cells, and that this binding competes with U2AF1's interaction with the 3' splice site and leads to increased distribution of RNA polymerase II (RNAPII) to promoters relative to gene bodies, thereby modulating cotranscriptional pre-mRNA splicing. Together, we uncovered a group of candidate proteins that can bind to both R-loops and DNA G4s, revealed the direct and strong interactions of U2AF1 with these nucleic acid structures, and established a biochemical rationale for U2AF1's occupancy in gene promoters. We also found that interaction with R-loops promotes U2AF1's phase separation, and our work suggests that U2AF1 modulates pre-mRNA splicing by regulating the partitioning of RNAPII between transcription initiation and elongation.