Abstract

Background: Massively parallel sequencing technology is revolutionizing approaches to genomic and genetic research. Since its advent, the scale and efficiency of Next-Generation Sequencing (NGS) have rapidly improved. In spite of this success, sequencing genomes or genomic regions with extremely biased base composition remains a major challenge for the currently available NGS platforms. The genomes of some important pathogenic organisms, such as Plasmodium falciparum (high AT content) and Mycobacterium tuberculosis (high GC content), display extremes of base composition. Standard library-preparation procedures that employ PCR amplification have been shown to cause uneven read coverage, particularly across AT-rich and GC-rich regions, leading to problems in genome assembly and variation analyses. Alternative library-preparation approaches that omit PCR amplification require large quantities of starting material and hence are not suitable for small amounts of DNA/RNA such as those from clinical isolates. We have developed and optimized library-preparation procedures suitable for low-quantity starting material and tolerant of extremely high AT content.

Results: We used our optimized conditions in parallel with standard methods to prepare Illumina sequencing libraries from a non-clinical and a clinical isolate (the latter containing ~53% host contamination). By analyzing and comparing the quality of the sequence data generated, we show that our optimized conditions, which include the PCR additive tetramethylammonium chloride (TMAC), produce amplified libraries with improved coverage of extremely AT-rich regions and reduced bias toward GC-neutral templates.

Conclusion: We have developed a robust and optimized Next-Generation Sequencing library amplification method suitable for extremely AT-rich genomes. The new amplification conditions significantly reduce bias and retain complexity at either extreme of base composition. This development will greatly benefit the sequencing of clinical samples, which often require amplification because of the low mass of DNA starting material.

Highlights

  • Parallel sequencing technology is revolutionizing approaches to genomic and genetic research

  • Much work has focused on solving amplification and sequencing problems associated with high GC content but nothing has been done to improve on those caused by AT-rich parts of the genome [4,5]

  • Two sets of polymerase chain reactions (PCR) were performed, one using the reagents and conditions provided by the manufacturer and the other deviating only by the inclusion of 60 mM tetramethylammonium chloride (TMAC) in the reaction mixture


Summary

Introduction

Parallel sequencing technology is revolutionizing approaches to genomic and genetic research. The scale and efficiency of Next-Generation Sequencing (NGS) have rapidly improved. In spite of this success, sequencing genomes or genomic regions with extremely biased base composition remains a major challenge for the currently available NGS platforms. Re-sequencing of these samples has faced two major challenges: the low mass of total parasite DNA and the extremely high AT content of the genome. These problems pose major technical challenges, predominantly in the library-preparation stage of the NGS pipeline and in subsequent data analysis. Polymerase chain reaction (PCR) amplification conditions, as currently used in standard library-preparation procedures, have been shown to bias sequence coverage towards DNA regions with balanced base composition [5,6]. These PCR-introduced artifacts have been shown to lead to misleading or inaccurate conclusions in the analysis of genome-variation data [4].
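Coverage bias of this kind is commonly examined by grouping fixed-size windows of the reference by GC fraction and comparing the mean read depth in each group. The following is a minimal sketch of that idea, not the analysis pipeline used in this study; the function names (`gc_bin`, `mean_depth_by_gc_bin`), the window size, and the number of bins are illustrative assumptions, and the per-base depth list is assumed to come from an external tool.

```python
from collections import defaultdict


def gc_bin(seq, n_bins=10):
    """Return the GC-fraction bin (0 .. n_bins-1) of a sequence, or None if it has no A/C/G/T."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    if acgt == 0:
        return None  # window consists only of N or other ambiguity codes
    # Integer arithmetic avoids floating-point edge effects at bin boundaries;
    # a window with 100% GC is folded into the last bin.
    return min(gc * n_bins // acgt, n_bins - 1)


def mean_depth_by_gc_bin(reference, depth, window=100, n_bins=10):
    """Average per-base depth over non-overlapping windows, grouped by GC bin.

    reference: reference sequence as a string (hypothetical input)
    depth:     per-base read depth, same length as reference (hypothetical input)
    """
    if len(reference) != len(depth):
        raise ValueError("reference and depth must have the same length")
    totals = defaultdict(float)
    counts = defaultdict(int)
    for start in range(0, len(reference) - window + 1, window):
        b = gc_bin(reference[start:start + window], n_bins)
        if b is None:
            continue
        totals[b] += sum(depth[start:start + window]) / window
        counts[b] += 1
    return {b: totals[b] / counts[b] for b in sorted(totals)}


if __name__ == "__main__":
    # Toy data: one fully AT window with low depth, one fully GC window with higher depth.
    ref = "AT" * 5 + "GC" * 5
    cov = [2] * 10 + [8] * 10
    print(mean_depth_by_gc_bin(ref, cov, window=10))
```

With an unbiased library the per-bin means would be roughly equal; a pronounced drop in the lowest GC bins would indicate under-representation of AT-rich regions.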
