Abstract

The increasing number and complexity of structures containing RNA chains in the Protein Data Bank (PDB) have created a need for automated structure annotation methods to replace or complement expert visual curation. This is especially true when searching for tertiary base motifs and substructures. Such base arrangements and motifs have diverse roles, ranging from contributions to structural stability to more direct involvement in the molecule’s function, such as forming the sites for ligand binding and catalytic activity. We review the utility of computational approaches for annotating RNA tertiary base motifs in a dataset of PDB structures, particularly graph-theoretical algorithms that can either search for and annotate such base motifs or find and annotate clusters of hydrogen-bond-connected bases. We also demonstrate how such graph-theoretical algorithms can be integrated into a workflow for the functional analysis and comparison of base arrangements and substructures, such as those involved in ligand binding. The capacity to carry out such automatic curation has led to the discovery of novel motifs, can give new context to known motifs, and enables the rapid compilation of RNA 3D motifs into a database.

Highlights

  • The deposition of RNA structure coordinate data in the central repository of biological macromolecular structures, the Protein Data Bank (PDB) [1], has lagged behind that of proteins

  • As the diversity and volume of RNA structures available in the PDB increase, efficient computational tools will be needed that can process such coordinate data in a high-throughput manner, both to discover novel motifs and to annotate known arrangements

  • We present and discuss the utility of a computer program that can annotate 3D base arrangements in RNA structures and a computer program that can annotate networks of RNA base clusters that are inter-connected by hydrogen bonds

Introduction

The deposition of RNA structure coordinate data in the Protein Data Bank (PDB) [1], the central repository of biological macromolecular structures, has lagged behind that of proteins. As the diversity and volume of RNA structures available in the PDB increase, efficient computational tools will be needed that can process such coordinate data in a high-throughput manner, both to discover novel motifs and to annotate known arrangements. These tools must also be able to compare the presence of base sub-structures across different structures, including the large and complex structures of the ribosomal assemblies. We present methods and protocols to annotate known 3D base arrangements, identify novel motifs, and compare the presence of such tertiary arrangements in different RNA structures.
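As a rough illustration of the graph-theoretical idea mentioned in the abstract, clusters of hydrogen-bond-connected bases can be viewed as connected components of a graph whose nodes are bases and whose edges are hydrogen bonds. The Python sketch below is not the program discussed in this article; it assumes the hydrogen-bonded base pairs have already been detected by some other means, and the function name `base_clusters`, the `min_size` parameter, and the base labels are hypothetical.

```python
from collections import defaultdict


def base_clusters(hbonds, min_size=3):
    """Group bases into clusters connected by hydrogen bonds.

    hbonds: iterable of (base_a, base_b) pairs; each base is a hashable
            label such as "A:G10" (this chain:residue format is an
            assumption made for the example).
    Returns a sorted list of clusters, each a sorted list of base labels,
    keeping only clusters with at least `min_size` bases.
    """
    # Build an undirected adjacency map: base -> set of bonded bases.
    adjacency = defaultdict(set)
    for a, b in hbonds:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen = set()
    clusters = []
    for start in adjacency:
        if start in seen:
            continue
        # Iterative depth-first search collects one connected component.
        stack, component = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        if len(component) >= min_size:
            clusters.append(sorted(component))
    return sorted(clusters)


# Made-up hydrogen-bond pairs, for illustration only (not real PDB data).
example_hbonds = [
    ("A:G10", "A:C25"),
    ("A:C25", "A:A40"),
    ("A:U51", "A:A62"),
]
clusters = base_clusters(example_hbonds)
```

With the default `min_size=3`, the isolated pair is filtered out and only the three-base cluster remains, which mirrors the interest in higher-order arrangements rather than simple base pairs. A real annotation tool would also need geometric criteria to decide which atoms are hydrogen bonded, which this sketch leaves out.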

Algorithms for Annotating RNA 3D Base Arrangements
Comparison of Computational Approaches in Annotating RNA Base Motifs
Literature survey
Annotation of Tertiary Base Arrangement Using the NASSAM Computer Program
Workflows for Annotating RNA 3D Base Arrangements
Searching for Novel RNA Base Motifs
Application of 3D Base Arrangement Searching to Identify Functional Sites
Application of 3D Base Arrangement Comparisons to Identify Pseudoknots
A Database of RNA Base Interactions
Conclusions and Future Directions