Abstract

Accurate classification of HIV-1 group M lineages, henceforth referred to as subtyping, is essential for understanding global HIV-1 molecular epidemiology. Because most HIV-1 sequencing is done for genotypic resistance testing of the pol gene, we sought to develop a set of geographically stratified pol sequences that represent HIV-1 group M sequence diversity. Representative pol sequences differ from representative complete genome sequences because not all CRFs have pol recombination points and because complete genome sequences may not faithfully reflect HIV-1 pol diversity. We developed a software pipeline that compiled 6,034 one-per-person complete HIV-1 pol sequences, annotated by country and year, belonging to 11 pure subtypes and 70 CRFs. For each subtype/CRF and country, the pipeline selected a set of sequences whose average distance to the remaining sequences is minimized, generating a Geographically-Stratified set of 716 Pol Subtype/CRF (GSPS) reference sequences. We provide extensive data on pol diversity within each subtype/CRF and country combination. The GSPS reference set will also be useful for HIV-1 pol subtyping.

Highlights

  • Accurate classification of HIV-1 group M lineages, referred to as subtyping, has been essential for understanding the evolution of divergent HIV-1 in the context of the global pandemic

  • HIV-1 group M sequences can be classified into many different lineages referred to as pure subtypes and circulating recombinant forms (CRFs)

  • For each distinct subtype/CRF and country combination, we characterized the extent of diversity in the pol gene and applied a partitioning around medoids (PAM) algorithm to identify the smallest number of centrally located sequences that would minimize the average distance to the closest leaf (ADCL) of the complete set of subtype/CRF/country sequences[14]
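The selection step described in the last highlight can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes a precomputed pairwise distance matrix for one subtype/CRF and country group, uses a simplified greedy-build-plus-swap variant of PAM, and introduces a hypothetical `max_adcl` threshold to decide when the reference set is large enough. The function names and the toy distances are illustrative only.

```python
def adcl(dist, medoids):
    """Average distance from every sequence to its closest medoid (ADCL)."""
    n = len(dist)
    return sum(min(dist[i][m] for m in medoids) for i in range(n)) / n


def pam(dist, k):
    """Simplified PAM: greedy build of k medoids, then swap refinement."""
    n = len(dist)
    medoids = []
    # Build phase: repeatedly add the sequence that lowers ADCL the most.
    for _ in range(k):
        best = min((i for i in range(n) if i not in medoids),
                   key=lambda i: adcl(dist, medoids + [i]))
        medoids.append(best)
    # Swap phase: replace a medoid with a non-medoid while ADCL improves.
    improved = True
    while improved:
        improved = False
        current = adcl(dist, medoids)
        for pos in range(k):
            for cand in range(n):
                if cand in medoids:
                    continue
                trial = medoids[:pos] + [cand] + medoids[pos + 1:]
                cost = adcl(dist, trial)
                if cost < current - 1e-12:
                    medoids, current, improved = trial, cost, True
    return sorted(medoids)


def smallest_reference_set(dist, max_adcl):
    """Smallest medoid set whose ADCL is at most max_adcl (assumed criterion)."""
    for k in range(1, len(dist) + 1):
        medoids = pam(dist, k)
        if adcl(dist, medoids) <= max_adcl:
            return medoids
    return list(range(len(dist)))


# Toy data: six "sequences" placed on a line, forming two clusters.
positions = [0, 1, 2, 10, 11, 12]
dist = [[abs(a - b) for b in positions] for a in positions]

print(smallest_reference_set(dist, max_adcl=1.0))  # → [1, 4]
```

In this toy case one medoid leaves an ADCL of 5.0, while two medoids (the centre of each cluster) bring it down to about 0.67, so the search stops at two reference sequences. Real pol data would use genetic distances between aligned sequences rather than positions on a line.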


Summary

Molecular Genetic Variation

Human immunodeficiency virus 1 • Kenya • Cyprus • Tanzania • Uganda • Ukraine • Cameroon • Pakistan • Russia • Rwanda • Uzbekistan • India • Kazakhstan • Republic of South Africa • Democratic Republic of the Congo • Sweden • Kingdom of Spain • Senegal • Nigeria • Australia • Belarus • Georgia • Italy • Somalia • United States of America • Japan • China • Brazil • South Korea • Germany • Thailand • Kingdom of Denmark • Argentina • French Republic • Peru • United Kingdom • Jamaica • Canada • Colombia • Cuba • Haiti • Trinidad and Tobago • Uruguay • Dominican Republic • Hong Kong • Myanmar • Kingdom of the Netherlands • The Philippines • Paraguay • Yemen • Switzerland • Ecuador • Taiwan Province • Bolivia • Gabon • Mexico • Poland • Zambia • Botswana • Malawi • Ethiopia • Nepal • Angola • Israel • Chad • Romania • Finland • Portuguese Republic • Ghana • Guinea-Bissau • Belgium • Central African Republic • Viet Nam • Afghanistan • Indonesia • Iran • Estonia • Niger • Greece • Mali • Burkina Faso • Benin • Cote d'Ivoire • Saudi Arabia • Malaysia • Luxembourg • Chile • Gambia • Singapore

