Abstract
Background noise is a major concern in metagenomic studies, and its removal requires extensive post-analytic bioinformatic filtering. This matters because significant signals may be lost when the signal-to-noise ratio is low. Plasmid residues, which are frequently present in reagents as contaminants, have not been investigated so far but may introduce a substantial bias. Here we show that plasmid sequences from different sources are omnipresent in molecular biology reagents. Using a metagenomic approach, we identified the pol gene of equine infectious anemia virus in human samples and traced it back to the expression plasmid used to produce a commercial reverse transcriptase. We found fragments of multiple other expression plasmids in human samples as well as in commercial polymerase preparations. Sources of plasmid contamination included the production chain of molecular biology reagents as well as contamination of reagents from the environment or from human handling of samples and reagents. Retrospective analyses of published metagenomic studies revealed inaccurate signal-to-noise differentiation. Hence, the plasmid sequences that appear to be omnipresent in molecular biology reagents may misguide conclusions derived from genomic and metagenomic datasets, and thus also clinical interpretations. Metagenomic datasets must be critically appraised for possible plasmid background noise in order to identify reliable and significant signals.
Highlights
Metagenomics has dramatically changed our view of the composition of microbial communities in a diversity of ecosystems, including the gut-associated microbiome
Equine infectious anemia virus pol sequences are derived from extrinsic plasmids
A phylogenetic analysis comparing the detected sequences with the pol genes of other lentiviruses, namely Human Immunodeficiency Virus-1 (HIV-1; NC_001802.1), Feline Immunodeficiency Virus (FIV; NC_001482.1) and Maedi/Visna virus strain kv1772 (NC_001452.1), showed that they were highly similar to the pol gene of the Equine Infectious Anemia Virus (EIAV) clone CL 22 strain (ID: M87581.1; Fig. 1)
Summary
Metagenomics has dramatically changed our view of the composition of microbial communities in a diversity of ecosystems, including the gut-associated microbiome. Genomic sequences that have been inadvertently introduced into samples during processing are sequenced with a similar efficiency as target sequences, and this background noise may mask the signals obtained from target sequences. This is a common contamination problem, as exemplified by studies that used Whole Genome Amplification (WGA), in which as few as 30% of all reads originated from target DNA [1]. An important but widely ignored source of foreign genomic sequences is the enzyme preparations used for NGS. Enzymes such as polymerases are produced recombinantly in prokaryotic hosts using inducible expression vectors. Viruses are, in general, highly specific for their hosts, and cross-species infections are rare events.
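To illustrate what screening reads for vector-derived background can look like, here is a minimal Python sketch. It is not the method used in the study: the k-mer matching approach, the function names, the thresholds and the toy sequences are all assumptions made for illustration. A real analysis would more likely align reads against a curated vector collection (for example NCBI UniVec) with a dedicated aligner.

```python
# Toy k-mer screen for vector-derived reads (illustrative assumption,
# not the pipeline described in the study).

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]

def kmers(seq, k):
    """Return the set of all k-mers in seq (empty if seq is shorter than k)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def build_index(references, k):
    """Collect the k-mers of every reference sequence on both strands."""
    index = set()
    for ref in references:
        index |= kmers(ref, k)
        index |= kmers(reverse_complement(ref), k)
    return index

def is_contaminant(read, index, k, min_fraction=0.5):
    """Flag a read if at least min_fraction of its k-mers occur in the index."""
    read_kmers = kmers(read, k)
    if not read_kmers:
        return False  # read too short to judge
    shared = sum(1 for km in read_kmers if km in index)
    return shared / len(read_kmers) >= min_fraction

def target_fraction(reads, index, k, min_fraction=0.5):
    """Fraction of reads that are NOT flagged as vector-derived."""
    if not reads:
        return 0.0
    kept = [r for r in reads if not is_contaminant(r, index, k, min_fraction)]
    return len(kept) / len(reads)

# Hypothetical data: a made-up "vector" fragment and three made-up reads.
K = 8
vector = "ATGGCTAGCAAGGATCCGTTCGATAGTTAACGGCC"
reads = [
    vector[5:25],                      # forward-strand vector fragment
    reverse_complement(vector[10:30]), # reverse-strand vector fragment
    "TTTTTTTTTTGGGGGGGGGGAAAAAAAAAA",  # unrelated sequence
]
index = build_index([vector], K)
flags = [is_contaminant(r, index, K) for r in reads]
fraction = target_fraction(reads, index, K)
```

In this toy run the two vector-derived reads are flagged and the unrelated read is kept, so one third of the reads count as target. The `min_fraction` threshold trades sensitivity against false positives and would need tuning on real data.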