Abstract
Background: Next-generation sequencing (NGS) technology has paved the way for rapid and cost-efficient de novo sequencing of bacterial genomes. In particular, the introduction of PCR-free library preparation procedures (LPPs) has led to major improvements, as PCR bias is largely reduced. However, to facilitate the assembly of Illumina paired-end sequence data and to enhance assembly performance, larger insert sizes are needed to support the repeat-bridging and repeat-resolution capabilities of current state-of-the-art assembly tools. In addition, information on the relationships between genomic GC content, library insert size and sequencing quality, as well as on the influence of library insert size, read length and sequencing depth on assembly performance, would help to plan sequencing projects in a targeted manner.

Results: Optimized DNA fragmentation settings and fine-tuned resuspension buffer to bead buffer ratios during fragment size selection were integrated into the Illumina TruSeq® DNA PCR-free LPP in order to produce sequencing libraries varying in average insert size for bacterial genomes within a range of 35.4–73.0 % GC content. The modified protocol consumes only half of the reagents per sample, thus doubling the number of preparations possible with a kit. Examination of different libraries revealed that sequencing quality decreases with increased genomic GC content and with larger insert sizes. Estimating assembly performance with metrics such as corrected NG50 and NGA50 showed that libraries with larger insert sizes can result in substantial assembly improvements, as long as appropriate assembly tools are chosen. However, such improvements seem to be limited to genomes with a low to medium GC content. Assembly performance tended to improve with read length, whereas sequencing depth was less important, provided a minimum coverage was reached.

Conclusions: Based on the optimized protocol developed, sequencing libraries with flexible insert sizes and lower reagent costs can be generated. Furthermore, increased knowledge about the interplay of sequencing quality, insert size, genomic GC content, read length, sequencing depth and the assembler used will help molecular biologists to set up an optimal experimental and analytical framework for Illumina next-generation sequencing of bacterial genomes.

Electronic supplementary material: The online version of this article (doi:10.1186/s13104-016-2072-9) contains supplementary material, which is available to authorized users.
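The abstract refers to NG50 and to sequencing depth without defining them. As a minimal sketch, assuming the usual definitions (NG50 is the length of the shortest contig among the longest contigs that together cover at least half of the reference genome size; mean depth is total sequenced bases divided by genome size), both can be computed as follows. The function names and all numbers are hypothetical illustrations and are not taken from the article. NGA50 is not covered here, because it additionally requires contigs to be aligned to a reference and broken at misassemblies.

```python
def ng50(contig_lengths, genome_size):
    """Return the NG50 of an assembly, or None if the contigs
    cover less than half of the reference genome size."""
    half_genome = genome_size / 2
    covered = 0
    # Walk through the contigs from longest to shortest and stop
    # as soon as the running total reaches half the genome size.
    for length in sorted(contig_lengths, reverse=True):
        covered += length
        if covered >= half_genome:
            return length
    return None


def mean_depth(read_count, read_length, genome_size):
    """Return the mean sequencing depth: total sequenced bases
    divided by genome size. read_count counts individual reads,
    so a read pair contributes two reads."""
    return read_count * read_length / genome_size


# Hypothetical values, chosen only to illustrate the calculation.
contigs = [100_000, 400_000, 200_000, 300_000]
print(ng50(contigs, genome_size=1_200_000))
print(mean_depth(read_count=1_000_000, read_length=250, genome_size=5_000_000))
```

Unlike N50, which uses the total assembly length as the denominator, NG50 uses the reference genome size, so assemblies of the same genome remain comparable even when their total lengths differ.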
Highlights
Next-generation sequencing (NGS) technology has paved the way for rapid and cost-efficient de novo sequencing of bacterial genomes
The largest average insert size of a library prepared with the standard Illumina protocol lies within a range of ~550–650 bp
Adjusted shearing settings during DNA fragmentation and optimized resuspension buffer (RB) to bead buffer (BB) ratios during fragment size selection were applied to create sequencing libraries varying in average insert size
Summary
Next-generation sequencing (NGS) technology has paved the way for rapid and cost-efficient de novo sequencing of bacterial genomes. The introduction of PCR-free library preparation procedures (LPPs) has led to major improvements, as PCR bias is largely reduced. Many efforts have been undertaken to improve existing LPPs for paired-end genome sequencing [4,5,6,7]. The commercially offered Illumina TruSeq® DNA PCR-free LPP is one of the most widely used solutions for generating paired-end genome sequencing libraries. It includes genomic DNA shearing by adaptive focused acoustics, which fragments DNA randomly, in contrast to the more directed fragmentation achieved by enzymatic digestion. The main application will be bacterial strains that grow well under laboratory conditions.