Abstract

Background: Plastid genome sequence information is vital to several disciplines in plant biology, including phylogenetics and molecular biology. The past five years have witnessed a dramatic increase in the number of completely sequenced plastid genomes, fuelled largely by advances in conventional Sanger sequencing technology. Here we report a further significant reduction in time and cost for plastid genome sequencing through the successful use of a newly available pyrosequencing platform, the Genome Sequencer 20 (GS 20) System (454 Life Sciences Corporation), to rapidly and accurately sequence the whole plastid genomes of the basal eudicot angiosperms Nandina domestica (Berberidaceae) and Platanus occidentalis (Platanaceae).

Results: More than 99.75% of each plastid genome was simultaneously obtained during two GS 20 sequence runs, to an average depth of coverage of 24.6× in Nandina and 17.3× in Platanus. The Nandina and Platanus plastid genomes shared essentially identical gene complements and possessed the typical angiosperm plastid structure and gene arrangement. To assess the accuracy of the GS 20 sequence, over 45 kilobases of sequence were generated for each genome using conventional sequencing. Overall error rates of 0.043% and 0.031% were observed in GS 20 sequence for Nandina and Platanus, respectively. More than 97% of all observed errors were associated with homopolymer runs, with ~60% of all errors associated with homopolymer runs of 5 or more nucleotides and ~50% of all errors associated with regions of extensive homopolymer runs. No substitution errors were present in either genome. Error rates were generally higher in the single-copy and noncoding regions of both plastid genomes relative to the inverted repeat and coding regions.

Conclusion: Highly accurate and essentially complete sequence information was obtained for the Nandina and Platanus plastid genomes using the GS 20 System. More importantly, the high accuracy observed in the GS 20 plastid genome sequence was generated for a significant reduction in time and cost over traditional shotgun-based genome sequencing techniques, although with approximately half the coverage of previously reported GS 20 de novo genome sequence. The GS 20 should be broadly applicable to angiosperm plastid genome sequencing, and therefore promises to expand the scale of plant genetic and phylogenetic research dramatically.

Highlights

  • Plastid genome sequence information is of central importance to several fields of plant biology, including phylogenetics, molecular biology and evolution, and plastid genetic engineering [1,2,3,4,5,6]

  • Perhaps the most promising of these new technologies involves the Genome Sequencer 20 (GS 20) System, a pyrosequencing platform developed by the 454 Life Sciences Corporation (Branford, CT, USA; available through Roche Diagnostics, Indianapolis, IN, USA)

Summary

Introduction

Plastid genome sequence information is vital to several disciplines in plant biology, including phylogenetics and molecular biology. More than 50 complete plastid genomes are available on GenBank, and several plastid genome sequencing projects [8,9,10] promise to increase that number to more than 200 in the near future. This dramatic growth in plastid genome sequencing has been driven largely by improvements in Sanger sequencing technology that have greatly reduced the time and cost involved in genome sequencing [11]. The GS 20 System implements several novel technologies that allow for relatively rapid and inexpensive pyrosequencing on a massive scale [14]. These include an emulsion-based method to amplify random fragment libraries of template DNA in bulk, fiber-optic slides containing high-density, picoliter-sized pyrosequencing reactors, and a three-bead system to deliver the enzymes necessary for the pyrosequencing reactions. The savings in time and money associated with GS 20 de novo genome sequence come at the cost of a slightly higher error rate compared to traditional Sanger-based genome sequence (~0.04% in GS 20 vs. 0.01% in Sanger sequence) [14,16,17].
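
To put the two error rates quoted above in concrete terms, the following minimal sketch converts a per-base error rate (in percent) into an expected number of erroneous bases over a sequence of a given length. The function name `expected_errors` and the 150,000 bp genome length are illustrative assumptions, not values taken from the text, and the calculation treats errors as uniform and independent across positions, which is a simplification.

```python
def expected_errors(error_rate_percent: float, length_bp: int) -> float:
    """Expected number of erroneous bases in a sequence of length_bp.

    Assumes a uniform, independent per-base error rate given in percent.
    """
    if error_rate_percent < 0 or length_bp < 0:
        raise ValueError("error rate and length must be non-negative")
    return error_rate_percent / 100.0 * length_bp


# Hypothetical plastid genome length, chosen only for illustration.
GENOME_LENGTH_BP = 150_000

# Approximate per-base error rates quoted in the text (percent).
RATES_PERCENT = {"GS 20": 0.04, "Sanger": 0.01}

for platform, rate in RATES_PERCENT.items():
    errors = expected_errors(rate, GENOME_LENGTH_BP)
    print(f"{platform}: about {errors:.0f} expected errors in {GENOME_LENGTH_BP} bp")
```

Under these assumptions the difference between the platforms amounts to a few dozen bases per genome. Because the errors reported for GS 20 sequence are concentrated in homopolymer runs rather than spread evenly, a real assessment would need to account for sequence context.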
