Abstract

RNA viruses have been used as model systems for understanding the patterns and processes of molecular evolution because they have high mutation rates and are genetically diverse. Human immunodeficiency virus 1 (HIV-1), the etiological agent of acquired immune deficiency syndrome, is highly genetically diverse and is classified into several groups and subtypes. However, it has been difficult to use these diverse sequences to establish the overall phylogenetic relationships among strains, or the trends in sequence conservation, by constructing phylogenetic trees. Our aims were to systematically characterize HIV-1 subtype evolution and to identify the regions responsible for HIV-1 subtype differentiation at the amino acid level in the Pol protein, which is often used to classify the HIV-1 subtypes. In this study, we systematically characterized the mutation sites in 2,052 Pol proteins from HIV-1 group M (144 subtype A; 1,528 subtype B; 380 subtype C) using sequence similarity networks. We also used spectral clustering to group the sequences based on the network graph structures. A stepwise analysis of the cluster hierarchies allowed us to estimate a possible evolutionary pathway for the Pol proteins. The subtype A sequences also clustered according to when and where the viruses were isolated, whereas the subtype B and subtype C sequences each remained a single cluster. Because the Pol protein has several functional domains, we identified the discriminative regions by comparing the structures of the domain-based networks. Our results suggest that sequence changes in the RNase H domain and the reverse transcriptase (RT) connection domain are responsible for the subtype classification. By analyzing the amino acid composition at each site in both domains, we found that a few specific amino acid residues (i.e., M357 in the RT connection domain and Q480, Y483, and L491 in the RNase H domain) represent the differences among the subtypes. These residues are located on the surface of the RT structure, in the vicinity of amino acid sites responsible for RT enzymatic activity or function.
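The per-site comparison of amino acid composition mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which statistic was used, so the total variation distance below is an assumption chosen for simplicity, and the short sequences are hypothetical toy data, not real Pol sequences. Site indices are 0-based columns of the toy alignment and do not correspond to Pol coordinates.

```python
from collections import Counter


def site_composition(seqs):
    """Per-column amino acid frequencies for equal-length aligned sequences."""
    n = len(seqs)
    return [{aa: c / n for aa, c in Counter(col).items()} for col in zip(*seqs)]


def site_divergence(group_a, group_b):
    """Total variation distance (0 to 1) between two groups at each site.

    Assumption: this metric is illustrative, not the one used in the study.
    """
    comp_a = site_composition(group_a)
    comp_b = site_composition(group_b)
    scores = []
    for fa, fb in zip(comp_a, comp_b):
        residues = set(fa) | set(fb)
        diff = sum(abs(fa.get(r, 0.0) - fb.get(r, 0.0)) for r in residues)
        scores.append(0.5 * diff)
    return scores


# Hypothetical toy alignments for two groups (not real subtype data).
subtype_x = ["AMQY", "AMQY", "ATQY"]
subtype_y = ["ALKF", "ALKY", "ALKF"]

scores = site_divergence(subtype_x, subtype_y)
# Sites whose composition is almost completely different between the groups.
discriminative = [i for i, s in enumerate(scores) if s >= 0.9]
```

A score of 0 means both groups have the same residue frequencies at that site, and a score of 1 means they share no residues there. The 0.9 cutoff is arbitrary and would need to be tuned on real data.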

Highlights

  • Human immunodeficiency virus 1 (HIV-1) is a retrovirus, a type of RNA virus that has been widely used as a model system for studying molecular evolution because it is highly adaptive and genetically diverse

  • Comparison of thousands of HIV-1 Pol sequences based on a network analysis

  • The classification of and relationships between each HIV-1 subtype were determined by constructing networks based on the amino acid sequence similarities of the Pol polyprotein (Supplementary Figure 1 and Figure 1)
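The idea of building a sequence similarity network and splitting it by spectral clustering can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline used in the study. Edge weights are plain fractional identity between pre-aligned toy sequences, and the split is a two-way partition by the sign of the Fiedler vector of the unnormalized graph Laplacian, approximated with power iteration so that only the standard library is needed. All function names and the sequences are hypothetical.

```python
import math


def identity(a, b):
    """Fraction of identical positions between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def similarity_network(seqs, threshold=0.0):
    """Weighted adjacency matrix; an edge is kept if identity > threshold."""
    n = len(seqs)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            s = identity(seqs[i], seqs[j])
            if s > threshold:
                w[i][j] = w[j][i] = s
    return w


def fiedler_vector(w, iters=2000):
    """Approximate the eigenvector for the second-smallest eigenvalue of
    L = D - W by power iteration on (c*I - L), with the constant vector
    projected out. Assumes a connected graph with at least two nodes."""
    n = len(w)
    deg = [sum(row) for row in w]
    c = 2.0 * max(deg)  # upper bound on the largest eigenvalue of L
    v = [math.sin(i + 1.0) for i in range(n)]  # deterministic start vector
    for _ in range(iters):
        mean = sum(v) / n
        v = [x - mean for x in v]  # remove the trivial constant component
        # (c*I - L) v = (c - deg) * v + W v
        u = [(c - deg[i]) * v[i] + sum(w[i][j] * v[j] for j in range(n))
             for i in range(n)]
        norm = math.sqrt(sum(x * x for x in u))
        v = [x / norm for x in u]
    return v


def spectral_bipartition(seqs):
    """Split sequences into two clusters by the sign of the Fiedler vector."""
    v = fiedler_vector(similarity_network(seqs))
    return [0 if x < 0 else 1 for x in v]


# Hypothetical toy sequences forming two similar groups (not real Pol data).
seqs = ["MKVLAAGIT", "MKVLAAGIS", "MKVLTAGIT",
        "MRVQSSGLT", "MRVQSSGLS", "MRIQSSGLT"]
labels = spectral_bipartition(seqs)
```

Which group receives label 0 or 1 is arbitrary, because an eigenvector is defined only up to sign. Applying the same split recursively within each cluster would give a hierarchy of clusters, which is one possible reading of the stepwise analysis described in the abstract.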

Summary

Introduction

Human immunodeficiency virus 1 (HIV-1) is a retrovirus, a type of RNA virus that has been widely used as a model system for studying molecular evolution because it is highly adaptive and genetically diverse. Molecular phylogenies have shown that HIV-1 arose in humans by cross-species infection from chimpanzees at the beginning of the twentieth century (Sharp and Hahn, 2010), and the infection has spread worldwide since the latter half of the twentieth century. The predominant lineage worldwide is called group M and is classified into nine subtypes based on their phylogenetic relationships: subtypes A, B, C, D, F, G, H, J, and K. Recombination occurs frequently within a subtype or between different subtypes, and plays an important role in the diversification of HIV-1 (Rambaut et al., 2004).
