Abstract

Background: Many potentially life-threatening infectious viruses are highly mutable in nature. Characterizing the fittest variants within a quasispecies from infected patients is expected to allow unprecedented opportunities to investigate the relationship between quasispecies diversity and disease epidemiology. The advent of next-generation sequencing technologies has allowed the study of virus diversity with high-throughput sequencing, although these methods come with higher error rates, which can artificially inflate diversity.

Results: Here we introduce a novel computational approach that incorporates base quality scores from next-generation sequencers when reconstructing viral genome sequences and simultaneously infers the number of variants present within a quasispecies. Comparisons on simulated and clinical data on dengue virus suggest that the novel approach provides a more accurate inference of the underlying number of variants within the quasispecies, which is vital for clinical efforts in mapping the within-host viral diversity. Sequence alignments generated by our approach are also found to exhibit lower rates of error.

Conclusions: The ability to infer the viral quasispecies colony that is present within a human host provides the potential for a more accurate classification of the viral phenotype. Understanding the genomics of viruses will be relevant not just to studying how to control or even eradicate these viral infectious diseases, but also to learning about the innate protection in the human host against the viruses.

Highlights

  • Many potentially life-threatening infectious viruses are highly mutable in nature

  • Simulation results: the performance of QuasQ in detecting true polymorphic sites is first measured by (i) how many of the true polymorphic sites QuasQ detects; and (ii) how many of the detected polymorphic sites are true polymorphic sites

  • These findings strongly indicate that a sequence alignment strategy for viruses that utilizes the base quality scores ends up inferring a lower number of variants in a quasispecies, as this will down-weight or even remove the contribution of polymorphic sites that are mainly attributable to base calling errors
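
The two criteria in the simulation highlight above correspond to what are commonly called recall (sensitivity) and precision. The following is a minimal sketch of how they could be computed, assuming that true and detected polymorphic sites are available as sets of genome positions. The function name and the example positions are hypothetical and are not taken from QuasQ or from the study data.

```python
def site_detection_metrics(true_sites, detected_sites):
    """Return (recall, precision) for polymorphic-site detection.

    recall    = fraction of true polymorphic sites that were detected
    precision = fraction of detected sites that are truly polymorphic
    Both inputs are iterables of genome positions (hypothetical format).
    """
    true_sites = set(true_sites)
    detected_sites = set(detected_sites)
    true_positives = true_sites & detected_sites

    # Guard against empty sets so the ratios are always defined.
    recall = len(true_positives) / len(true_sites) if true_sites else float("nan")
    precision = (
        len(true_positives) / len(detected_sites) if detected_sites else float("nan")
    )
    return recall, precision


# Illustrative positions only; not data from the paper.
true_sites = {120, 455, 980, 1502}
detected_sites = {120, 455, 1502, 2210, 3001}

recall, precision = site_detection_metrics(true_sites, detected_sites)
print(f"recall={recall:.2f} precision={precision:.2f}")
```

Reporting both numbers matters because a caller that flags every site as polymorphic reaches perfect recall with poor precision, which is the failure mode expected when base calling errors are mistaken for real variation.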


Summary

Introduction

Many potentially life-threatening infectious viruses are highly mutable in nature. Characterizing the fittest variants within a quasispecies from infected patients is expected to allow unprecedented opportunities to investigate the relationship between quasispecies diversity and disease epidemiology. The advent of next-generation sequencing (NGS) technologies has allowed the study of virus diversity with high-throughput sequencing, although these methods come with higher error rates, which can artificially inflate diversity. Quasispecies are associated with the error-prone replication, high mutation rates and short generation times that characterize the evolutionary dynamics of viruses, which generate the genetic diversity that allows the species to persist in their hosts [2]. NGS techniques have been widely applied in de novo sequencing, resequencing, metagenomics and intra-host characterization of infectious pathogens. These techniques produce more sequencing fragments than traditional Sanger sequencing, allowing more detail at much higher coverage. However, confounding results may arise from the PCR step, shorter read lengths and higher sequencing error rates of these sequencing fragments.
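
To illustrate why base quality scores help against sequencing errors, the sketch below uses the standard Phred definition, in which a quality score Q corresponds to an error probability of 10^(-Q/10). Each base call in one alignment column is weighted by its probability of being correct. This is only an assumed, simplified weighting scheme for illustration and is not the QuasQ algorithm; the function names and the example column are hypothetical.

```python
from collections import defaultdict


def phred_to_error_prob(q):
    """Convert a Phred quality score to a base-calling error probability."""
    return 10 ** (-q / 10)


def weighted_base_counts(column):
    """Sum, per base, the probability that each call is correct.

    `column` is an iterable of (base, phred_quality) pairs for one
    alignment position (a hypothetical input format for this sketch).
    """
    weights = defaultdict(float)
    for base, q in column:
        weights[base] += 1.0 - phred_to_error_prob(q)
    return dict(weights)


# Illustrative column: three confident 'A' calls, two low-quality 'G' calls.
column = [("A", 40), ("A", 38), ("A", 35), ("G", 3), ("G", 2)]
counts = weighted_base_counts(column)

for base, weight in sorted(counts.items()):
    print(base, round(weight, 3))
```

With raw counts the minor base 'G' accounts for two of five reads, whereas after weighting its support drops below one effective read, so a site supported mainly by low-quality calls contributes far less evidence of polymorphism.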

