Abstract

HIV-1 replicates via a low-fidelity polymerase with a high mutation rate, so strong conservation of individual nucleotides is highly indicative of critical structural or functional properties. Identifying such conservation can reveal novel insights into viral behaviour. We analysed 3651 publicly available sequences for nucleic acid conservation beyond that required by amino acid constraints, using a novel scale-free method that identifies regions of outlying score together with a codon scoring algorithm. Sequences with outlying scores were further analysed using an algorithm that produces local RNA folds whilst accounting for alignment properties. Eleven conserved regions were identified, some corresponding to well-known cis-acting functions of the HIV-1 genome and others whose conservation has not previously been noted. We identify rational causes for many of these, including cis functions, possible additional reading frame usage, a plausible mechanism by which the central polypurine tract primes second-strand DNA synthesis and a conformational stabilising function of a region at the 5′ end of env.
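The idea of conservation beyond amino acid constraints can be illustrated with a minimal sketch. This is not the scale-free method or the codon scoring algorithm used in the study; it is a simplified stand-in that assumes an in-frame, gap-aligned set of coding sequences and scores each codon column by how often the sequences that encode the consensus amino acid also share the same codon. The function names `synonymous_conservation` and `windowed_mean` are hypothetical.

```python
from collections import Counter

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def synonymous_conservation(aligned_seqs):
    """Score each codon column of an in-frame alignment (illustrative only).

    Among sequences encoding the consensus amino acid at a column, return the
    fraction that use the single most common codon. A value near 1.0 for an
    amino acid with several synonymous codons suggests a constraint at the
    nucleotide level. Columns with no unambiguous codon score None.
    Note: Met and Trp have one codon each, so they always score 1.0.
    """
    n_codons = min(len(s) for s in aligned_seqs) // 3
    scores = []
    for i in range(n_codons):
        codons = [s[3 * i:3 * i + 3].upper() for s in aligned_seqs]
        codons = [c for c in codons if c in CODON_TABLE]  # drop gaps/ambiguity codes
        if not codons:
            scores.append(None)
            continue
        aa_counts = Counter(CODON_TABLE[c] for c in codons)
        consensus_aa = aa_counts.most_common(1)[0][0]
        syn = Counter(c for c in codons if CODON_TABLE[c] == consensus_aa)
        scores.append(syn.most_common(1)[0][1] / sum(syn.values()))
    return scores


def windowed_mean(scores, width):
    """Average the per-codon scores over a sliding window, skipping None."""
    out = []
    for start in range(len(scores) - width + 1):
        vals = [v for v in scores[start:start + width] if v is not None]
        out.append(sum(vals) / len(vals) if vals else None)
    return out


# Toy alignment (made up for illustration): Met, Ala with four codons, Lys.
toy = ["ATGGCTAAA", "ATGGCCAAA", "ATGGCAAAA", "ATGGCGAAA"]
per_codon = synonymous_conservation(toy)
smoothed = windowed_mean(per_codon, 2)
```

In the toy alignment the alanine column is fully conserved at the protein level but uses four different codons, so it scores low; a run of high scores at positions where synonymous change is possible is the kind of signal the analysis looks for.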

Highlights

  • Human Immunodeficiency Virus (HIV) infection remains a significant global health burden, with an estimated 36.7 million people worldwide living with the virus and 1.0 million AIDS-related deaths in 2016 [1]

  • We looked for coding regions of HIV-1 that change relatively little, by turning the problem of finding such regions into a problem in signal processing, and solving this using a novel analytical approach that we recently described

  • The conserved region found first in env, and the fourth region identified reading from its 5′ end (HXB2 nucleotide reference 7662–8051; NL4-3 7652–8041), corresponds well with the well-known highly conserved Rev-response element (RRE) [21,22,23]

Summary

Introduction

Human Immunodeficiency Virus (HIV) infection remains a significant global health burden, with an estimated 36.7 million people worldwide living with the virus and 1.0 million AIDS-related deaths in 2016 [1]. Improving our knowledge of viral genomic structure and function permits better understanding of the viral lifecycle and of host/virus interactions, and may suggest interventions to target viral replication and survival. The HIV-1 genome is one of the most intensively studied genetic sequences. It contains a large number of cis-acting regions whose function depends on the structures into which the RNA folds. Many of these have been studied and solved at a secondary structure level, and for some there are three-dimensional data. The whole genome has been analysed biochemically at a secondary structure level and many of the known functional regions have been mapped [2]. There are regions where the function of the individual sequence nucleotides, either directly or through coding function, is the main constraint on the sequence, and primary structure dominates over secondary structural requirements.
