Abstract

Background: Methods of determining whether or not any particular HIV-1 sequence stems - completely or in part - from some unknown HIV-1 subtype are important for the design of vaccines and molecular detection systems, as well as for epidemiological monitoring. Nevertheless, only a single algorithm, the Branching Index (BI), has been developed for this task so far. Moving along the genome of a query sequence in a sliding window, the BI computes a ratio quantifying how closely the query sequence clusters with a subtype clade. In its current version, however, the BI does not provide predicted boundaries of unknown fragments.

Results: We have developed Unknown Subtype Finder (USF), an algorithm based on a probabilistic model, which automatically determines which parts of an input sequence originate from an as yet unknown subtype. The underlying model is based on a simple profile hidden Markov model (pHMM) for each known subtype and an additional pHMM for an unknown subtype. The emission probabilities of the latter are estimated using the emission frequencies of the known subtypes by means of a (position-wise) probabilistic model for the emergence of new subtypes. We have applied USF to SIV and HIV-1 sequences formerly classified as having emerged from an unknown subtype. Moreover, we have evaluated its performance on artificial HIV-1 recombinants and non-recombinant HIV-1 sequences. The results have been compared with the corresponding results of the BI.

Conclusions: Our results demonstrate that USF is suitable for detecting segments in HIV-1 sequences stemming from as yet unknown subtypes. Comparing USF with the BI shows that our algorithm performs as well as the BI or better.

Highlights

  • Methods of determining whether or not any particular human immunodeficiency virus-1 (HIV-1) sequence stems - completely or in part - from some unknown HIV-1 subtype are important for the design of vaccines and molecular detection systems, as well as for epidemiological monitoring

  • As HIV-1 is one of the most genetically variable viruses and genomic recombination is frequent in HIV-1 [4], classifying the corresponding viral sequence data is a challenging task

  • We present the results of i) the calibration of Unknown Subtype Finder (USF) on a) artificial HIV-1 recombinants and b) non-recombinant HIV-1 sequences designated as having emerged from a known subtype, ii) the application of USF to a) SIV sequences and b) sequences designated as unknown in the LANL HIV database, and iii) the comparison of USF and the Branching Index (BI)

Summary

Introduction

Methods of determining whether or not any particular HIV-1 sequence stems - completely or in part - from some unknown HIV-1 subtype are important for the design of vaccines and molecular detection systems, as well as for epidemiological monitoring. An accurate and reliable classification of viral sequence data for human immunodeficiency virus-1 (HIV-1) and some other viruses of interest is important for epidemiological studies. It facilitates the understanding of the influence of genetic diversity on host immune response and provides therapeutic decision support [1,2,3]. To our knowledge, only one algorithm, called the Branching Index (BI), has been developed for deciding whether an HIV-1 sequence in question stems - completely or in part - from a subtype still unknown [15,16]. Note that unknown sequence segments cannot be deduced simply by applying an existing subtype classification method based on a probabilistic model, such as jpHMM, and identifying regions of low a posteriori probabilities for all of the well-known subfamilies (see paragraph ‘Discussion and conclusions - Miscellaneous’).
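The idea of scanning a query sequence window by window and asking whether an "unknown" model explains it better than every known subtype can be illustrated with a toy sketch. This is neither USF nor the BI: it uses plain per-column nucleotide frequency profiles instead of pHMMs, and the way the unknown profile is built here (averaging the known profiles and blending the result with a uniform distribution) is an assumption made purely for illustration, since this summary does not specify how USF estimates its emission probabilities. All function names, parameters and example sequences are hypothetical.

```python
import math

ALPHABET = "ACGT"


def column_profile(aligned_seqs, pseudocount=1.0):
    """Per-position nucleotide frequencies from equal-length aligned sequences."""
    profile = []
    for column in zip(*aligned_seqs):
        counts = {base: pseudocount for base in ALPHABET}
        for base in column:
            if base in counts:  # gaps and ambiguity codes are ignored
                counts[base] += 1.0
        total = sum(counts.values())
        profile.append({base: counts[base] / total for base in ALPHABET})
    return profile


def unknown_profile(known_profiles, uniform_weight=0.5):
    """ASSUMPTION (not USF's estimator): mean of the known profiles,
    blended position-wise with a uniform distribution."""
    blended = []
    for pos in range(len(known_profiles[0])):
        column = {}
        for base in ALPHABET:
            mean = sum(p[pos][base] for p in known_profiles) / len(known_profiles)
            column[base] = (1 - uniform_weight) * mean + uniform_weight / len(ALPHABET)
        blended.append(column)
    return blended


def window_log_likelihoods(query, profile, window):
    """Sum of log emission probabilities of the query in each sliding window."""
    per_pos = [
        math.log(profile[i][base]) if base in ALPHABET else 0.0
        for i, base in enumerate(query)
    ]
    return [sum(per_pos[s:s + window]) for s in range(len(query) - window + 1)]


def flag_unknown_windows(query, known_profiles, window):
    """True for each window where the unknown profile beats every known one."""
    unknown = window_log_likelihoods(query, unknown_profile(known_profiles), window)
    known = [window_log_likelihoods(query, p, window) for p in known_profiles]
    return [u > max(k[i] for k in known) for i, u in enumerate(unknown)]


# Two artificial "subtypes" and a query whose second half matches neither.
subtype_a = column_profile(["AAAAAAAA"] * 3)
subtype_b = column_profile(["CCCCCCCC"] * 3)
flags = flag_unknown_windows("AAAAGGGG", [subtype_a, subtype_b], window=4)
```

In this toy example the leading window, which matches the first artificial subtype, is not flagged, whereas the trailing G-rich window is. Unlike this window-based flagging, a model with explicit states per subtype, as described for USF, can predict boundaries of unknown fragments directly.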
