Abstract

Pairs of nucleotides within functional nucleic acid secondary structures often display evidence of coevolution that is consistent with the maintenance of base-pairing. Here, we introduce a sequence evolution model, MESSI (Modeling the Evolution of Secondary Structure Interactions), that infers coevolution associated with base-paired sites in DNA or RNA sequence alignments. MESSI can estimate coevolution while accounting for an unknown secondary structure. MESSI can also use graphics processing unit parallelism to increase computational speed. We used MESSI to infer coevolution associated with GC, AU (AT in DNA), GU (GT in DNA) pairs in noncoding RNA alignments, and in single-stranded RNA and DNA virus alignments. Estimates of GU pair coevolution were found to be higher at base-paired sites in single-stranded RNA viruses and noncoding RNAs than estimates of GT pair coevolution in single-stranded DNA viruses. A potential biophysical explanation is that GT pairs do not stabilize DNA secondary structures to the same extent that GU pairs do in RNA. Additionally, MESSI estimates the degrees of coevolution at individual base-paired sites in an alignment. These estimates were computed for a SHAPE-MaP-determined HIV-1 NL4-3 RNA secondary structure. We found that estimates of coevolution were more strongly correlated with experimentally determined SHAPE-MaP pairing scores than three nonevolutionary measures of base-pairing covariation. To assist researchers in prioritizing substructures with potential functionality, MESSI automatically ranks substructures by degrees of coevolution at base-paired sites within them. Such a ranking was created for an HIV-1 subtype B alignment, revealing an excess of top-ranking substructures that have been previously identified as having structure-related functional importance, among several uncharacterized top-ranking substructures.

Highlights

  • The primary role of nucleic acid molecules, such as deoxyribonucleic acid (DNA) and ribonucleic acid (RNA), is to encode genetic information for storage and transfer

  • The structure information entropies were higher for the permuted data sets, with the exception of RF00003, which had marginally lower structure information entropies for both of the permuted data sets. This result is surprising as RF00003 corresponds to the U1 spliceosomal RNA, a component of a spliceosome (Burge et al 2013) with a thermodynamically stable structure

  • MESSI was developed for modeling substitutions that are consistent with the maintenance of canonical base-pairing at paired sites within alignments of DNA and RNA sequences

Read more

Summary

Introduction

The primary role of nucleic acid molecules, such as deoxyribonucleic acid (DNA) and ribonucleic acid (RNA), is to encode genetic information for storage and transfer Both types of molecules can form structures with additional functions (Mattick 2003). DNA is ordinarily thought of as a double-stranded molecule forming the iconic double helical configuration (Watson and Crick 1953), many viral genomes consist entirely of single-stranded DNA (ssDNA) or single-stranded RNA (ssRNA) molecules. Such single-stranded nucleic acid molecules are far less constrained than double-stranded ones in the variety of functional structures that they can form. This study focuses exclusively on RNA and DNA secondary structures

Methods
Results
Conclusion
Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call