Abstract
G-quadruplexes (G4s) are four-stranded nucleic acid structures occurring in the genomes of all living organisms and viruses. It is increasingly evident that these structures play important molecular roles, generally by modulating gene expression and overall genome integrity. For a long time, G-quadruplexes have been studied specifically in the context of human promoters, telomeres, and associated diseases (cancers, neurological disorders). Several G-quadruplex-binding proteins are known, providing promising targets for influencing G-quadruplex-related processes in organisms. Nonetheless, in plants, only a small number of G-quadruplex-binding proteins have been described to date. Thus, we aimed to bioinformatically inspect the available protein sequences to find the best protein candidates with the potential to bind G-quadruplexes. Two similar glycine- and arginine-rich G-quadruplex-binding motifs have been described in humans. The first is the so-called “RGG motif” (RRGDGRRRGGGGRGQGGRGRGGGFKG), and the second, which has been described recently, is known as the “NIQI motif” (RGRGRGRGGGSGGSGGRGRG). Using this general knowledge, we searched for plant proteins containing the above-mentioned motifs using two independent approaches (BLASTp and FIMO scanning) and revealed many proteins containing the G4-binding motif(s). Our research also revealed the core proteins involved in G4 folding and resolving in green plants, algae, and the key plant model organism, Arabidopsis thaliana. The discovered protein candidates were annotated using STRINGdb and sorted by their molecular and physiological roles in simple schemes. Our results point to the significant role of G4-binding proteins in the regulation of gene expression in plants.
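To make the idea of a motif-based search concrete, the following is a minimal sketch that slides an ungapped window along a protein sequence and reports the position most similar to one of the two motifs quoted above. It is an illustration only and does not reproduce BLASTp or FIMO, both of which use substitution matrices or position-specific scoring and statistical significance estimates. The function name `best_window_identity` and the example sequence are hypothetical.

```python
from typing import Tuple

# Motif sequences as quoted in the abstract.
RGG_MOTIF = "RRGDGRRRGGGGRGQGGRGRGGGFKG"
NIQI_MOTIF = "RGRGRGRGGGSGGSGGRGRG"


def best_window_identity(protein: str, motif: str) -> Tuple[int, float]:
    """Return (start, identity) of the ungapped window most similar to motif.

    Identity is the fraction of positions where the window and the motif
    carry the same residue. This is a simplification (assumption): it
    ignores gaps, conservative substitutions, and statistical significance.
    """
    protein = protein.upper()
    width = len(motif)
    if len(protein) < width:
        return -1, 0.0  # sequence too short to contain a full window

    best_start, best_identity = -1, -1.0
    for start in range(len(protein) - width + 1):
        window = protein[start:start + width]
        matches = sum(a == b for a, b in zip(window, motif))
        identity = matches / width
        if identity > best_identity:
            best_start, best_identity = start, identity
    return best_start, best_identity


# Hypothetical toy sequence: the NIQI motif flanked by two residues on each side.
toy_protein = "MA" + NIQI_MOTIF + "KL"
start, identity = best_window_identity(toy_protein, NIQI_MOTIF)
```

A real screen would replace the identity score with the scoring and significance thresholds of the chosen tool; the sketch only shows the windowed comparison that underlies such scans.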
Highlights
G-quadruplexes (G4s) are secondary structures of nucleic acids that can arise in guanine-rich DNA or RNA regions [1]
Based on the presence of a G4-binding motif, we identified more than 400 proteins with a theoretical potential to bind G4 structures in Arabidopsis thaliana (555 containing the significant RGG motif, and 408 containing the significant NIQI motif)
The complete FIMO results can be found in the Supplementary Materials (Files S1 and S2)
Summary
G-quadruplexes (G4s) are secondary structures of nucleic acids that can arise in guanine-rich DNA or RNA regions [1]. Each G4 is formed by its basic units, termed guanine tetrads (see Figure 1). A single guanine tetrad consists of four guanine nucleotides interconnected by Hoogsteen base pairing. G4s are further stabilized by the positively charged monovalent ions that are localized in their central cavity [2]. Although G4s share common properties, there is great structural diversity in their composition, including in the number of planes, loop lengths, and the orientation of tracts [3,4,5].
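As a rough illustration of what "guanine-rich regions" with several planes and variable loop lengths look like at the sequence level, the sketch below searches for four runs of at least three guanines separated by short loops. This regular expression is a commonly used heuristic for putative G4-forming sequences and is not taken from the text above; the loop range of 1 to 7 nucleotides and the minimum run length of three are assumptions, and a sequence match does not demonstrate that a G4 actually folds.

```python
import re

# Heuristic pattern (assumption): four G-runs of length >= 3,
# separated by three loops of 1-7 nucleotides each (DNA or RNA alphabet).
PQS_PATTERN = re.compile(r"G{3,}(?:[ACGTU]{1,7}G{3,}){3}")


def find_putative_g4(seq):
    """Return (start, end, matched_sequence) for each non-overlapping hit."""
    seq = seq.upper()
    return [(m.start(), m.end(), m.group()) for m in PQS_PATTERN.finditer(seq)]


# Hypothetical toy sequence with four GGG runs and single-nucleotide loops.
hits = find_putative_g4("TTGGGAGGGTGGGAGGGTT")
```

Only the guanine-rich strand is scanned here; searching the complementary strand would require the equivalent pattern built from cytosine runs.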