Abstract
Next-generation sequencing (NGS) is a valuable tool for the detection and quantification of HIV-1 variants in vivo. However, these technologies require detailed characterization and control of artificially induced errors before they can be applied to accurate haplotype reconstruction. To investigate the occurrence of substitutions, insertions, and deletions at the individual steps of RT-PCR and NGS, 454 pyrosequencing was performed on amplified and non-amplified HIV-1 genomes. Artificial recombination was explored by mixing five different HIV-1 clonal strains (5-virus-mix) and applying different RT-PCR conditions followed by 454 pyrosequencing. Error rates ranged from 0.04% to 0.66% and were similar in amplified and non-amplified samples. Discrepancies were observed between forward and reverse reads, indicating that most errors were introduced during the pyrosequencing step. With the 5-virus-mix, non-optimized, standard RT-PCR conditions introduced artificial recombinants in at least 30% of the reads, which led to an underestimation of true haplotype frequencies. We reduced the fraction of recombinants to 0.9–2.6% by using optimized, artifact-reducing RT-PCR conditions. This approach enabled correct haplotype reconstruction and frequency estimates consistent with reference data obtained by single genome amplification. RT-PCR conditions are therefore crucial for correct frequency estimation and analysis of haplotypes in heterogeneous virus populations. We developed an RT-PCR procedure that generates NGS data suitable for reliable haplotype reconstruction and quantification.
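To illustrate why artificial recombinants lower the apparent frequency of the true haplotypes, the following minimal sketch classifies reads from a mixture of known clonal strains. It is not the analysis pipeline used in the study. It assumes that reads are already aligned to the same coordinates as the references, that they are error-free, and that a recombinant has a single breakpoint. The function names `classify_read` and `haplotype_frequencies` and the toy strains `strain_A` and `strain_B` are hypothetical.

```python
from collections import Counter


def classify_read(read, refs):
    """Label an aligned read as a parental strain, 'recombinant', or 'unassigned'.

    refs maps a strain name to its reference sequence; all sequences and the
    read are assumed to have the same length (an assumption of this sketch).
    """
    # An exact match to one reference is counted as that parental haplotype.
    for name, seq in refs.items():
        if read == seq:
            return name
    # Otherwise, test whether a prefix of one strain joined to a suffix of
    # another strain reproduces the read (single-breakpoint recombinant).
    for left_name, left in refs.items():
        for right_name, right in refs.items():
            if left_name == right_name:
                continue
            for bp in range(1, len(read)):
                if read[:bp] == left[:bp] and read[bp:] == right[bp:]:
                    return "recombinant"
    # Reads that fit neither pattern (for example, reads with sequencing
    # errors) are left unassigned.
    return "unassigned"


def haplotype_frequencies(reads, refs):
    """Return the fraction of reads assigned to each label."""
    counts = Counter(classify_read(r, refs) for r in reads)
    return {label: n / len(reads) for label, n in counts.items()}


# Toy data: two hypothetical strains, with some reads being chimeras.
toy_refs = {"strain_A": "AAAAAAAA", "strain_B": "CCCCCCCC"}
toy_reads = ["AAAAAAAA"] * 4 + ["CCCCCCCC"] * 3 + ["AAAACCCC"] * 2 + ["CCAAAAAA"]
print(haplotype_frequencies(toy_reads, toy_refs))
```

In this toy mixture, every chimeric read is one that would otherwise have counted toward a parental strain, so the parental frequencies are underestimated by the share of reads labeled as recombinant. Real data would additionally require alignment and tolerance for sequencing errors.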
Highlights
Human immunodeficiency virus type 1 (HIV-1) is highly diverse, both on a global scale and within individual HIV-1-infected subjects [1].
In the second set-up, the same plasmid preparation was used to amplify the protease gene with fusion primers that consist of an HIV-1-specific region, a multiplex identifier, and either the A or B sequence required for 454 pyrosequencing.
Characterizing the diversity and evolutionary dynamics of virus populations within infected hosts is of great importance.
Summary
Human immunodeficiency virus type 1 (HIV-1) is highly diverse, both on a global scale and within individual HIV-1-infected subjects [1]. Low-abundance haplotypes have been shown to be present in patients shortly after infection [3,4,5,6]. Now that next-generation sequencing (NGS) platforms are widely available, virus populations can be studied much faster than with the classical method of single genome sequencing. These technologies require rigorous estimation of error rates and identification of error sources, especially when viral haplotypes are quantified (reviewed in [11]). Several studies have investigated the accuracy of pyrosequencing, and homopolymeric regions are well known to be the main source of insertion-deletion (indel) errors [12,13]. PCR artifacts are well known and are addressed by optimizing PCR conditions and ...