Abstract

Unbiased high-throughput sequencing of whole metagenome shotgun DNA libraries is a promising new approach to identifying microbes in clinical specimens, which, unlike other techniques, is not limited to known sequences. Unlike most sequencing applications, it is highly sensitive to laboratory contaminants as these will appear to originate from the clinical specimens. To assess the extent and diversity of sequence contaminants, we aligned 57 “1000 Genomes Project” sequencing runs from six centers against the four largest NCBI BLAST databases, detecting reads of diverse contaminant species in all runs and identifying the most common of these contaminant genera (Bradyrhizobium) in assembled genomes from the NCBI Genome database. Many of these microorganisms have been reported as contaminants of ultrapure water systems. Studies aiming to identify novel microbes in clinical specimens will greatly benefit from not only preventive measures such as extensive UV irradiation of water and cross-validation using independent techniques, but also a concerted effort to sequence the complete genomes of common contaminants so that they may be subtracted computationally.

Highlights

  • Systematic pathogen discovery based on unbiased high-throughput sequencing [1] was first used in 2008 to detect two novel viruses by pyrosequencing clinical specimens.

  • Unbiased high-throughput sequencing has been suggested as a way of detecting etiological microbes in cancer tissue [6], an approach we consider promising for prostate cancer [7,8]. To facilitate such studies, we constructed the Leif Microbiome Analyzer, a bioinformatics tool similar to PathSeq but designed to eliminate the need for the cluster computing typically required by NCBI blastn-based tools such as PathSeq, even when aligning against the largest NCBI BLAST databases.

  • The presence of significant levels of Bradyrhizobium genus sequence in our two clinical samples led us to examine, as a negative control, reads from two human 1000 Genomes Project runs that did not align to the human genome.

Introduction

Systematic pathogen discovery based on unbiased high-throughput sequencing [1] was first used in 2008 to detect two novel viruses by pyrosequencing clinical specimens. Unbiased high-throughput sequencing has been suggested as a way of detecting etiological microbes in cancer tissue [6], an approach we consider promising for prostate cancer [7,8]. To facilitate such studies, we constructed the Leif Microbiome Analyzer, a bioinformatics tool similar to PathSeq but designed to eliminate the need for the cluster computing typically required by NCBI blastn-based tools such as PathSeq, even when aligning against the largest NCBI BLAST databases. While testing the Leif Microbiome Analyzer by examining two clinical samples for reads not aligning to the human genome, we encountered many reads from diverse species not known to be part of the human microbiome, suggesting the presence of contamination; members of the Bradyrhizobium genus were prominent. If this situation arises commonly, it would be cost-prohibitive to screen candidate non-aligning reads using polymerase chain reaction (PCR) on the original specimens. In the case of novel contaminant microbes whose genomes have not been completely sequenced, the 1000 confirmatory PCR reaction figure cannot be reduced by sampling only some reads of each species, as it is not possible to know which reads arose from the same species.
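The screening step described above, tallying the genera seen among reads that do not align to the human genome and setting aside those attributed to known contaminants, can be sketched roughly as follows. This is a minimal illustration and not the Leif Microbiome Analyzer's actual method: the input format (pairs of read ID and assigned genus), the function names, and the toy data are all assumptions, and the contaminant list contains only the one genus named in the text.

```python
from collections import Counter

# Assumed contaminant list; only Bradyrhizobium is named in the text.
KNOWN_CONTAMINANT_GENERA = {"Bradyrhizobium"}


def tally_genera(assignments):
    """Count reads per genus from (read_id, genus) pairs."""
    return Counter(genus for _read_id, genus in assignments)


def split_by_contaminant(assignments, contaminants=KNOWN_CONTAMINANT_GENERA):
    """Separate reads into suspected contaminants and remaining candidates."""
    flagged, candidates = [], []
    for read_id, genus in assignments:
        if genus in contaminants:
            flagged.append((read_id, genus))
        else:
            candidates.append((read_id, genus))
    return flagged, candidates


# Toy, made-up assignments for illustration only (not data from the study).
example = [
    ("read1", "Bradyrhizobium"),
    ("read2", "Bradyrhizobium"),
    ("read3", "GenusA"),
]

counts = tally_genera(example)
flagged, candidates = split_by_contaminant(example)
```

Only the reads left in `candidates` would then need confirmatory follow-up. This kind of subtraction works only for reads that can be assigned to an already sequenced contaminant, which is the limitation the paragraph above points out for novel contaminants.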
