Abstract

Ribonucleic acid (RNA) secondary structures and branching properties are important for determining functional ramifications in biology. While energy minimization of the Nearest Neighbor Thermodynamic Model (NNTM) is commonly used to identify such properties (number of hairpins, maximum ladder distance, etc.), it is difficult to know whether the resultant values fall within expected dispersion thresholds for a given energy function. The goal of this study was to construct a Markov chain capable of examining the dispersion of RNA secondary structures and branching properties obtained from NNTM energy function minimization independent of a specific nucleotide sequence. Plane trees are studied as a model for RNA secondary structure, with energy assigned to each tree based on the NNTM, and a corresponding Gibbs distribution is defined on the trees. Through a bijection between plane trees and 2-Motzkin paths, a Markov chain converging to the Gibbs distribution is constructed, and fast mixing time is established by estimating the spectral gap of the chain. The spectral gap estimate is obtained through a series of decompositions of the chain and also by building on known mixing time results for other chains on Dyck paths. The resulting algorithm can be used as a tool for exploring the branching structure of RNA, especially for long sequences, and to examine branching structure dependence on energy model parameters. Full exposition is provided for the mathematical techniques used with the expectation that these techniques will prove useful in bioinformatics, computational biology, and additional extended applications.

Highlights

  • Computational and mathematical applications play a critical role in the analysis of the structure and function of biological molecules, including ribonucleic acid (RNA)

  • The methods are divided into an overview of the RNA secondary structure Nearest Neighbor Thermodynamic Model (NNTM) plane tree model and energy functions (Section 2.1) and an all-encompassing explanation of the mathematical preliminaries that lay the foundation for the derived results and corresponding algorithms (Section 2.2)

  • We present the constructed Markov chain and the corresponding algorithms devised for the sampling task, together with a proof of an upper bound on the relaxation time, which shows that the chain mixes rapidly

Introduction

Computational and mathematical applications play a critical role in the analysis of the structure and function of biological molecules, including ribonucleic acid (RNA). RNA is an essential biological polymer with many roles, including information transfer, regulation of gene expression, and catalysis of chemical reactions. The primary structure of an RNA molecule may be understood as a sequence of nucleotides: adenine, uracil, guanine, and cytosine. We frequently abbreviate these as A, U, G, and C, respectively. RNA molecules are single-stranded and may interact with themselves, forming A–U, G–U, and G–C bonds. The secondary structure of an RNA molecule is a set of such bonds
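
The pairing rule described above can be illustrated with a minimal Python sketch. It is not taken from the paper: the names `ALLOWED_PAIRS` and `is_valid_structure` are hypothetical. The requirement that each position take part in at most one bond is an assumption borrowed from the usual definition of secondary structure, not something stated in the text here.

```python
# Pairs named in the text: A-U, G-U, and G-C (order does not matter).
ALLOWED_PAIRS = {frozenset("AU"), frozenset("GU"), frozenset("GC")}


def is_valid_structure(sequence, bonds):
    """Check a candidate set of bonds against a nucleotide sequence.

    `bonds` is an iterable of (i, j) index pairs (0-based).
    Assumption (not from the source text): each position may take
    part in at most one bond.
    """
    used = set()
    for i, j in bonds:
        # Both indices must lie inside the sequence and be distinct.
        if not (0 <= i < len(sequence) and 0 <= j < len(sequence)):
            return False
        if i == j or i in used or j in used:
            return False
        # The two bases must form one of the allowed pairs.
        if frozenset((sequence[i], sequence[j])) not in ALLOWED_PAIRS:
            return False
        used.update((i, j))
    return True


if __name__ == "__main__":
    seq = "GGGAAAUCC"
    print(is_valid_structure(seq, [(0, 8), (1, 7), (2, 6)]))
    print(is_valid_structure(seq, [(3, 4)]))
```

Representing the structure as a set of index pairs mirrors the description of secondary structure as a set of bonds. Additional constraints that a full model would impose are deliberately left out of this sketch.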
