Abstract

Next-generation genomic technology has both greatly accelerated the pace of genome research and increased our reliance on draft genome sequences. While groups such as the Genomics Standards Consortium have made strong efforts to promote genome standards, there is still a general lack of uniformity among published draft genomes, leading to challenges for downstream comparative analyses. This lack of uniformity is a particular problem when using standard draft genomes, which frequently have large numbers of low-quality sequencing tracts. Here we present a proposal for an “enhanced-quality draft” genome that identifies at least 95% of the coding sequences, thereby effectively providing a full accounting of the genic component of the genome. Enhanced-quality draft genomes are easily attainable through a combination of small- and large-insert next-generation, paired-end sequencing. We illustrate the generation of an enhanced-quality draft genome by re-sequencing the plant pathogenic bacterium Pseudomonas syringae pv. phaseolicola 1448A (Pph 1448A), which has a published, closed genome sequence of 5.93 Mbp. We use a combination of Illumina paired-end and mate-pair sequencing, and surprisingly find that de novo assemblies with 100x paired-end coverage and mate-pair sequencing with as little as 2–5x coverage are substantially better than assemblies based on higher coverage. The rapid and low-cost generation of large numbers of enhanced-quality draft genome sequences will be of particular value for microbial diagnostics and biosecurity, which rely on precise discrimination of potentially dangerous clones from closely related benign strains.
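To give a sense of scale for the coverage figures quoted in the abstract, the sketch below applies the standard relation coverage = (number of reads x read length) / genome size to the 5.93 Mbp Pph 1448A genome. The read length of 100 bp and the function names are assumptions made purely for illustration; the abstract does not state the read length used.

```python
import math

# Closed genome size of Pph 1448A as stated in the abstract (5.93 Mbp).
GENOME_SIZE_BP = 5_930_000

# Hypothetical read length; the abstract does not give one.
ASSUMED_READ_LENGTH_BP = 100


def reads_for_coverage(target_coverage, read_length_bp, genome_size_bp=GENOME_SIZE_BP):
    """Smallest number of reads whose total bases reach the target fold-coverage."""
    return math.ceil(target_coverage * genome_size_bp / read_length_bp)


def coverage_from_reads(n_reads, read_length_bp, genome_size_bp=GENOME_SIZE_BP):
    """Average fold-coverage given a read count and read length."""
    return n_reads * read_length_bp / genome_size_bp


if __name__ == "__main__":
    # 100x paired-end coverage and the 2-5x mate-pair range from the abstract.
    for label, cov in [("paired-end 100x", 100), ("mate-pair 2x", 2), ("mate-pair 5x", 5)]:
        n = reads_for_coverage(cov, ASSUMED_READ_LENGTH_BP)
        print(f"{label}: about {n:,} reads of {ASSUMED_READ_LENGTH_BP} bp")
```

Under the assumed read length, the mate-pair library needs only a few percent as many reads as the paired-end library, which is consistent with the abstract's point that a small amount of large-insert data is sufficient.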

Highlights

  • The rapid development and widespread adoption of next-generation genomic technology provides an unprecedented ability to generate genomic data, and has dramatically increased our understanding of both the breadth and depth of biological diversity [1,2].

  • The Genomics Standards Consortium (GSC) and Human Microbiome Project Jumpstart Consortium designate a spectrum of genome sequence standards [3]. Standard Draft: minimally filtered or unfiltered data, from any number of different sequencing platforms, that are assembled into contigs.

  • Genomic DNA extraction: genomic DNA was prepared using a modified protocol based on the Gentra Puregene Genomic DNA Purification Kit instructions (Qiagen Canada, Toronto, ON), in which all reagent volumes were doubled.


Summary

Introduction

The rapid development and widespread adoption of next-generation (next-gen) genomic technology provides an unprecedented ability to generate genomic data, and has dramatically increased our understanding of both the breadth and depth of biological diversity [1,2]. The nature of the technology has also dramatically increased our reliance on draft, rather than finished, genome sequences. The Genomics Standards Consortium (GSC) and Human Microbiome Project Jumpstart Consortium designate a spectrum of genome sequence standards [3]. Standard Draft: minimally filtered or unfiltered data, from any number of different sequencing platforms, that are assembled into contigs. This is the minimum standard for a submission to the public databases. Sequence of this quality will likely harbor many regions of poor quality and can be relatively incomplete. Standard Draft is the least expensive to produce and still possesses useful information.

