Abstract

The 3D architectures of RNAs are essential for understanding their cellular functions. Although an accurate scoring function based on the statistics of known RNA structures is a key component of successful RNA structure prediction and evaluation, few tools or web servers can be used directly to perform comprehensive statistical analyses of RNA 3D structures. In this work, we developed RNAStat, an integrated tool for computing statistics on RNA 3D structures. For given RNA structures, RNAStat automatically calculates structural properties such as size and shape, and shows their distributions. Based on the RNA structure annotation from DSSR, RNAStat provides statistical information on RNA secondary structure motifs, including canonical/non-canonical base pairs, stems, and various loops. In particular, the geometry of base pairing and base stacking can be calculated in RNAStat by constructing a local coordinate system for each base. In addition, RNAStat provides the distribution of distances between any atoms to help users build distance-based RNA statistical potentials. To test the usability of the tool, we established a non-redundant RNA 3D structure dataset and, based on this dataset, carried out a comprehensive statistical analysis of RNA structures, which could help guide RNA structure modeling. The Python code of RNAStat, the dataset used in this work, and the corresponding statistical data files are freely available on GitHub (https://github.com/RNA-folding-lab/RNAStat).
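
The two simplest quantities mentioned above, a size measure and an inter-atomic distance distribution, can be illustrated with a short NumPy sketch. This is not RNAStat's code or API: the function names, the choice of the unweighted radius of gyration as the size measure, the bin width, the cutoff, and the toy coordinates are all assumptions made for illustration.

```python
import numpy as np


def radius_of_gyration(coords):
    """Unweighted radius of gyration: RMS distance of atoms from their centroid.

    Assumption: used here as one possible "size" measure; the source does not
    say which definition RNAStat uses.
    """
    coords = np.asarray(coords, dtype=float)
    centered = coords - coords.mean(axis=0)
    return float(np.sqrt((centered ** 2).sum(axis=1).mean()))


def pairwise_distance_histogram(coords, r_max=20.0, bin_width=0.5):
    """Normalized histogram of all unique atom-atom distances up to r_max.

    r_max and bin_width (both in angstroms) are hypothetical defaults.
    """
    coords = np.asarray(coords, dtype=float)
    # Full distance matrix via broadcasting, then keep each pair once (i < j).
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    i, j = np.triu_indices(len(coords), k=1)
    edges = np.arange(0.0, r_max + bin_width, bin_width)
    counts, edges = np.histogram(dist[i, j], bins=edges)
    total = counts.sum()
    freq = counts / total if total > 0 else counts.astype(float)
    return freq, edges


# Toy coordinates (angstroms), not taken from any real RNA structure.
toy = np.array([[0.0, 0.0, 0.0],
                [3.0, 0.0, 0.0],
                [0.0, 4.0, 0.0],
                [0.0, 0.0, 12.0]])
rg = radius_of_gyration(toy)
freq, edges = pairwise_distance_histogram(toy)
```

In practice the coordinates would come from parsed PDB or mmCIF files, and the histogram would be accumulated over a whole dataset, separately for each pair of atom types, before being turned into a potential.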

Highlights

  • RNA molecules play important roles in various biological processes, ranging from carrying genetic information, participating in protein synthesis, catalyzing biochemical reactions, and regulating gene expression to acting as structural molecules in cellular organelles (Doherty and Doudna, 2001; Dethoff et al, 2012; Cech and Steitz, 2014).

  • FARNA/FARFAR assembles trinucleotide fragments into 3D structures for a given RNA sequence using a Monte Carlo algorithm and a knowledge-based energy function, whose parameters were determined from statistical analysis of known RNA 3D structures (Das and Baker, 2007; Das et al, 2010).

  • We present RNAStat, an integrated tool for computing comprehensive statistics on RNA 3D structures.

Introduction

RNA molecules play important roles in various biological processes, ranging from carrying genetic information, participating in protein synthesis, catalyzing biochemical reactions, and regulating gene expression to acting as structural molecules in cellular organelles (Doherty and Doudna, 2001; Dethoff et al, 2012; Cech and Steitz, 2014). The RNA structures deposited in the Protein Data Bank (PDB) are still limited, since it is expensive and time-consuming to derive high-resolution RNA 3D structures experimentally (Rose et al, 2017; Westhof and Leontis, 2021). This situation has led to great demand in structural biology for obtaining RNA structures with prediction methods (Hajdin et al, 2010; Shi Y.-Z. et al, 2014; Miao et al, 2017; Schlick and Pyle, 2017). The potential energy of our model is mainly physics-based; its potentials, especially the bonded potentials, were parameterized by statistical analysis of the available RNA 3D structures in the PDB (Shi Y.-Z. et al, 2014; Jin et al, 2019).
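
One common way to turn a structural distribution into a bonded potential is Boltzmann inversion, U(r) = -k_B T ln P(r). The sketch below is only an illustration of that general idea and is not claimed to be the procedure used in the cited models: the function name, the temperature, the bin count, and the synthetic bond lengths (drawn from a normal distribution around an arbitrary 3.9 Å) are all assumptions, and no Jacobian correction is applied.

```python
import numpy as np

KB_KCAL = 0.0019872041  # Boltzmann constant in kcal/(mol*K)


def boltzmann_inversion(samples, bins=30, temperature=298.15):
    """Estimate U(r) = -kB*T*ln P(r) from observed values of one coordinate.

    Returns bin centers and energies (kcal/mol) shifted so the minimum is 0.
    Empty bins have no defined energy and are reported as NaN.
    """
    counts, edges = np.histogram(samples, bins=bins)
    centers = 0.5 * (edges[:-1] + edges[1:])
    prob = counts / counts.sum()
    energy = np.full(prob.shape, np.nan)
    populated = prob > 0
    energy[populated] = -KB_KCAL * temperature * np.log(prob[populated])
    energy -= np.nanmin(energy)
    return centers, energy


# Synthetic "bond lengths" in angstroms; NOT data from real RNA structures.
rng = np.random.default_rng(0)
bond_lengths = rng.normal(loc=3.9, scale=0.1, size=5000)

centers, energy = boltzmann_inversion(bond_lengths)
r_min = centers[np.nanargmin(energy)]  # location of the energy minimum
```

A harmonic form such as K(r - r0)^2 could then be fitted to the populated bins near the minimum to obtain an equilibrium length and a force constant.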
