Abstract

The Ion Torrent Personal Genome Machine (PGM) is a new sequencing platform that differs substantially from other sequencing technologies by measuring pH rather than light to detect polymerisation events. Using re-sequencing datasets, we comprehensively characterise the biases and errors introduced by the PGM at both the base and flow level, across a combination of factors, including chip density, sequencing kit, template species and machine. We found two distinct insertion/deletion (indel) error types that accounted for the majority of errors introduced by the PGM. The main error source was inaccurate flow-calls, which introduced indels at a raw rate of 2.84% (1.38% after quality clipping) using the OneTouch 200 bp kit. Inaccurate flow-calls typically resulted in over-called short homopolymers and under-called long homopolymers. Flow-call accuracy decreased with consecutive flow cycles, but we also found significant periodic fluctuations in the flow error-rate, corresponding to specific positions within the flow-cycle pattern. The second, less common error type, high frequency indel (HFI) errors, consists of indels that occur at very high frequency in the reads relative to a given base position in the reference genome but that, in the majority of instances, were not replicated consistently across separate runs. HFI errors occur approximately once every thousand bases in the reference, and correspond to 0.06% of bases in reads. Currently, the PGM does not achieve the accuracy of competing light-based technologies. However, flow-call inaccuracy is systematic, and the statistical models of flow-values developed here will enable the development of PGM-specific bioinformatics approaches that account for these errors. HFI errors may prove more challenging to address, especially for polymorphism and amplicon applications, but may be overcome by sequencing the same DNA template across multiple chips.

Highlights

  • The last decade has seen dramatic advances in sequencing technology that have relied on highly-parallel optical sensing of polymerisation reactions

  • Every sequencing technology introduces unique errors and biases into the resulting DNA sequences, and understanding PGM-specific characteristics is crucial to determining suitable applications for this new technology

  • We comprehensively examine the types of errors and biases in Personal Genome Machine (PGM)-sequenced data across several experimental variables, including chip density, template kit, template DNA and across two machines

Introduction

The last decade has seen dramatic advances in sequencing technology that have relied on highly-parallel optical sensing of polymerisation reactions. Although these advances have substantially reduced sequencing costs, further reduction is limited by the dependence of these platforms on photo-receptive sensors and their associated reagents. The PGM leverages advances in semi-conductor technology and ion-sensitive transistors to sequence DNA using only DNA polymerase and natural nucleotides, with each polymerisation event recognised by pH changes alone [1]. The PGM requires library preparation steps similar to those for Roche 454 shotgun libraries: an adapter and key sequence are ligated to the DNA templates and, under optimal conditions, a single DNA template is affixed to a bead and clonally amplified using emulsion PCR. The beads are loaded onto the chip, where, on average, each well contains less than a single bead.
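
To make the link between flow-calls and indels concrete, the following minimal Python sketch converts a base sequence into ideal flow-calls (the homopolymer length observed at each nucleotide flow) and back again. It is an illustration only, not the statistical model described in the abstract. The simple repeating four-flow cycle `TACG` and the function names `bases_to_flows` and `flows_to_bases` are assumptions made for this example; the PGM's actual flow order is not given in this excerpt.

```python
from itertools import cycle


def bases_to_flows(seq, flow_order="TACG"):
    """Return the ideal flow-call (homopolymer length) for each flow.

    flow_order is a hypothetical repeating cycle of nucleotide flows.
    """
    unknown = set(seq) - set(flow_order)
    if unknown:
        raise ValueError(f"bases not in flow order: {sorted(unknown)}")

    flows = []
    pos = 0
    for nuc in cycle(flow_order):
        if pos >= len(seq):
            break
        # Count how many consecutive template bases match this flow.
        run = 0
        while pos < len(seq) and seq[pos] == nuc:
            run += 1
            pos += 1
        flows.append(run)
    return flows


def flows_to_bases(flows, flow_order="TACG"):
    """Rebuild the base sequence implied by a list of flow-calls."""
    return "".join(nuc * n for nuc, n in zip(cycle(flow_order), flows))


ideal = bases_to_flows("TTACGGGA")

# Over-calling a short homopolymer (A: 1 -> 2) inserts a base.
over_called = list(ideal)
over_called[1] += 1

# Under-calling a long homopolymer (GGG: 3 -> 2) deletes a base.
under_called = list(ideal)
under_called[3] -= 1

print(ideal)
print(flows_to_bases(over_called))
print(flows_to_bases(under_called))
```

Because each flow-call is an integer homopolymer length, an error of one in a single flow changes the read length by one base. This is why inaccurate flow-calls appear as indels rather than substitutions when reads are aligned to a reference.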
