G-quadruplexes are a large group of nucleic acid-based structures. In recent years, they have attracted attention because of their biological roles in telomeres and promoter regions. These structures show wide diversity in topology; however, methods for the structural classification of G-quadruplexes have long been neglected, and only a limited number of studies have aimed to provide a secondary structure classification method. The discovery of bulged and mismatched G-quadruplexes has complicated the picture further, since most of the available tools fail to distinguish these non-canonical G-quadruplex motifs. Moreover, interpreting the output of these tools still requires expert knowledge. In this study, we propose a new method, CIIS-GQ, for the identification of unimolecular G-quadruplexes and their classification by secondary structure based on three-dimensional structural data. Briefly, the coordinates of guanines are processed to identify tetrads, loops, and bulges. The secondary structure is then presented as a depiction that shows the loop types, the bulges, and the guanines that participate in each tetrad. In addition, CIIS-GQ identifies non-guanine nucleotides that join the G-tetrads and form multiplets. Finally, the results of our study are compared with the DSSR and ElTetrado classification methods, and the advantages of the proposed depiction method for representing secondary structures are discussed. The source code of the method can be accessed via https://github.com/TugayDirek/CIIS-GQ .
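
To make the tetrad identification step more concrete, the following is a minimal sketch of how guanine coordinates might be turned into tetrads. It is not the CIIS-GQ implementation: it assumes that a tetrad can be detected as a closed ring of four guanines linked by Hoogsteen hydrogen bonds (N1 to O6 and N2 to N7), and the 3.5 Å cutoff, the input layout (a dictionary of atom coordinates per residue), and the function names `donates_to` and `find_tetrads` are all hypothetical choices made for illustration.

```python
from math import dist

# Assumed donor-acceptor distance threshold in angstroms (illustrative value,
# not taken from the paper).
HBOND_CUTOFF = 3.5


def donates_to(g_a, g_b, cutoff=HBOND_CUTOFF):
    """Return True if guanine g_a donates both Hoogsteen H-bonds to g_b.

    Each guanine is a dict mapping atom names ("N1", "N2", "O6", "N7")
    to (x, y, z) coordinates. The pairing checked is N1...O6 and N2...N7.
    """
    return (dist(g_a["N1"], g_b["O6"]) <= cutoff
            and dist(g_a["N2"], g_b["N7"]) <= cutoff)


def find_tetrads(guanines, cutoff=HBOND_CUTOFF):
    """Find rings of four distinct guanines a -> b -> c -> d -> a.

    `guanines` maps a residue identifier to its atom-coordinate dict.
    Each tetrad is reported once, as a tuple in donor-to-acceptor order.
    """
    # Directed graph: an edge a -> b means a donates to b.
    successors = {
        a: [b for b in guanines
            if b != a and donates_to(guanines[a], guanines[b], cutoff)]
        for a in guanines
    }

    seen = set()      # frozensets of members, used to drop rotated duplicates
    tetrads = []
    for a in guanines:
        for b in successors[a]:
            for c in successors[b]:
                for d in successors[c]:
                    members = (a, b, c, d)
                    # Require four distinct residues and a closed ring.
                    if len(set(members)) == 4 and a in successors[d]:
                        key = frozenset(members)
                        if key not in seen:
                            seen.add(key)
                            tetrads.append(members)
    return tetrads
```

In a sketch like this, loops and bulges would then be derived from the sequence positions of the tetrad guanines, for example by looking at the nucleotides that lie between consecutive G-tracts or that interrupt a G-tract. That step is omitted here because the abstract does not specify how it is done.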