Abstract

Next-generation sequencing technologies, such as ultra-deep pyrosequencing (UDPS), allow detailed investigation of complex populations, such as RNA viruses, but their utility is limited by errors introduced during sample preparation and sequencing. By tagging each individual cDNA molecule with a barcode, referred to as a Primer ID, before PCR and sequencing, these errors could theoretically be removed. Here we evaluated the Primer ID methodology on 257,846 UDPS reads generated from an HIV-1 SG3Δenv plasmid clone and plasma samples from three HIV-infected patients. The Primer ID consisted of 11 randomized nucleotides (4,194,304 combinations) in the primer for cDNA synthesis, which introduced a unique sequence tag into each cDNA molecule. Consensus template sequences were constructed for reads with Primer IDs that were observed three or more times. Despite high numbers of input template molecules, the number of consensus template sequences was low. With 10,000 input molecules for the clone, as few as 97 consensus template sequences were obtained because of a highly skewed frequency of resampling. Furthermore, the number of sequenced templates was overestimated because of PCR errors in the Primer IDs. Finally, some consensus template sequences were erroneous because of hotspots for UDPS errors. The Primer ID methodology has the potential to provide highly accurate deep sequencing, but challenges remain. In particular, it is important to find ways to obtain a more even frequency of resampling of template molecules and to identify and remove artefactual consensus template sequences generated by PCR errors in the Primer IDs.
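The consensus step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the input format (pairs of Primer ID and read sequence), the requirement that reads are already aligned to equal length, and the simple per-position majority vote are all assumptions made for illustration. Only the Primer ID length (11) and the threshold of three reads come from the text.

```python
from collections import Counter, defaultdict

PRIMER_ID_LENGTH = 11        # randomized nucleotides, as stated in the abstract
MIN_READS_PER_PRIMER_ID = 3  # Primer IDs "observed three or more times"


def primer_id_space(length=PRIMER_ID_LENGTH):
    """Number of distinct Primer IDs for a given count of randomized bases."""
    return 4 ** length


def build_consensus(reads, min_reads=MIN_READS_PER_PRIMER_ID):
    """Return one majority-vote consensus sequence per Primer ID.

    `reads` is an iterable of (primer_id, sequence) pairs (hypothetical
    input format). Reads sharing a Primer ID are assumed to be aligned
    and of equal length; this sketch does not handle indels.
    """
    groups = defaultdict(list)
    for primer_id, sequence in reads:
        groups[primer_id].append(sequence)

    consensus = {}
    for primer_id, sequences in groups.items():
        # Skip Primer IDs that were resampled too few times.
        if len(sequences) < min_reads:
            continue
        # Majority base per column; ties go to the base seen first.
        consensus[primer_id] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*sequences)
        )
    return consensus
```

For example, three reads tagged with the same Primer ID, one of which carries a single substitution, collapse to the majority sequence, while a Primer ID seen only once yields no consensus. A sketch like this would not address the problems the abstract reports, such as errors within the Primer IDs themselves or systematic error hotspots shared by most reads in a group.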

Highlights

  • Ultra-deep pyrosequencing (UDPS) is one application of 454 next-generation sequencing (NGS) that has been used for identification of minority variants, for example in HIV populations resistant to antiretroviral drugs [1,2,3,4,5]

  • The utility of UDPS is limited by the high error rate of the 454 sequencing technology and by errors introduced during cDNA synthesis and PCR amplification prior to 454 sequencing [3, 6,7,8,9]. 454 sequencing errors primarily involve insertions and deletions in homopolymeric regions, i.e. stretches of identical nucleotides in the target sequence [3, 6,7,8,9]

  • Errors introduced during cDNA synthesis and PCR are usually single nucleotide substitutions, in particular transitions, and are difficult to correct by post-sequencing data cleaning procedures [7]
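The two error classes named in the highlights above can be told apart with simple sequence checks. The sketch below is illustrative only: the function names and the minimum run length of 3 are assumptions, and it handles only the bases A, C, G and T.

```python
from itertools import groupby

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def substitution_type(ref_base, read_base):
    """Return 'transition', 'transversion', or None if the bases match.

    A transition swaps purine for purine (A<->G) or pyrimidine for
    pyrimidine (C<->T); any other substitution is a transversion.
    Only A, C, G and T are supported in this sketch.
    """
    ref_base, read_base = ref_base.upper(), read_base.upper()
    if ref_base == read_base:
        return None
    pair = {ref_base, read_base}
    same_class = pair <= PURINES or pair <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def homopolymer_runs(sequence, min_length=3):
    """Return (start, base, length) for each run of identical bases.

    `min_length` is an arbitrary illustrative threshold, not a value
    taken from the study.
    """
    runs = []
    position = 0
    for base, group in groupby(sequence):
        length = len(list(group))
        if length >= min_length:
            runs.append((position, base, length))
        position += length
    return runs
```

Calling `homopolymer_runs` on a target sequence flags the stretches where 454 indel errors would be expected to concentrate, while `substitution_type` labels a mismatch between a reference base and a read base.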

Introduction

Ultra-deep pyrosequencing (UDPS) is one application of 454 next-generation sequencing (NGS) that has been used for identification of minority variants, for example in HIV populations resistant to antiretroviral drugs [1,2,3,4,5]. 454 sequencing errors primarily involve insertions and deletions (indels) in homopolymeric regions, i.e. stretches of identical nucleotides in the target sequence [3, 6,7,8,9]. The impact of these errors can be partially alleviated by using post-sequencing data cleaning procedures [7, 10, 11]. Errors introduced during cDNA synthesis and PCR are usually single nucleotide substitutions, in particular transitions, and are difficult to correct by post-sequencing data cleaning procedures [7]. This latter type of error is also relevant to other NGS platforms, such as Illumina, Ion Torrent and Pacific Biosciences, which are currently replacing the 454 platform that we used in the present study.

Thermoscript (Life Technologies, Stockholm, Sweden) was added and cDNA was synthesized by incubation at 42°C for 15 min, 50°C for 30 min, 85°C for 5 min and 4°C, according to the manufacturer’s instructions. cDNA synthesis was done in five parallel reactions to allow reverse transcription of all available RNA.
