Abstract

Background: Clone libraries provide researchers with a powerful resource to study nucleic acid from diverse sources. Metagenomic clone libraries in particular have aided in studies of microbial biodiversity and function, and allowed the mining of novel enzymes. Libraries are often constructed by cloning large inserts into cosmid or fosmid vectors. Recently, there have been reports of GC bias in fosmid metagenomic libraries, and it was speculated to be a result of fragmentation and loss of AT-rich sequences during cloning. However, evidence in the literature suggests that transcriptional activity or gene product toxicity may play a role.

Results: To explore possible mechanisms responsible for sequence bias in clone libraries, we constructed a cosmid library from a human microbiome sample and sequenced DNA from different steps during library construction: crude extract DNA, size-selected DNA, and cosmid library DNA. We confirmed a GC bias in the final cosmid library, and we provide evidence that the bias is not due to fragmentation and loss of AT-rich sequences but is likely occurring after DNA is introduced into Escherichia coli. To investigate the influence of strong constitutive transcription, we searched the sequence data for promoters and found that rpoD/σ70 promoter sequences were underrepresented in the cosmid library. Furthermore, when we examined the genomes of taxa that were differentially abundant in the cosmid library relative to the original sample, we found the bias to be more correlated with the number of rpoD/σ70 consensus sequences in the genome than with simple GC content.

Conclusions: The GC bias of metagenomic libraries does not appear to be due to DNA fragmentation. Rather, analysis of promoter sequences provides support for the hypothesis that strong constitutive transcription from sequences recognized as rpoD/σ70 consensus-like in E. coli may lead to instability, causing loss of the plasmid or loss of the insert DNA that gives rise to the transcription. Despite widespread use of E. coli to propagate foreign DNA in metagenomic libraries, the effects of in vivo transcriptional activity on clone stability are not well understood. Further work is required to tease apart the effects of transcription from those of gene product toxicity.

Electronic supplementary material: The online version of this article (doi:10.1186/s40168-015-0086-5) contains supplementary material, which is available to authorized users.

Highlights

  • Clone libraries provide researchers with a powerful resource to study nucleic acid from diverse sources

  • Our results show that library bias correlates only loosely with GC content but surprisingly well with the rpoD consensus content of the genome. These results suggest that GC content may be only a rough proxy for rpoD consensus content and may not itself be an accurate predictor of library presence or abundance; in some cases, a genome may have a moderate or relatively high percent GC and still possess an unusually high rpoD consensus content, leading to an underrepresentation in the cosmid library.

  • The results presented in this report, together with what was already known from the literature, support the hypothesis that GC bias in typical clone libraries is related to constitutive promoter activity of the insert in E. coli; DNA topology and toxic protein effects may also influence insert and plasmid maintenance.
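The two quantities compared in the highlights above, GC content and rpoD/σ70 consensus content, can be illustrated with a minimal Python sketch. The matching rule used here is an assumption for illustration only: it looks for the textbook E. coli σ70 hexamers (-35 TTGACA and -10 TATAAT) separated by a 15 to 19 bp spacer, with no mismatches allowed, on both strands. The source does not state the exact search criteria the authors used, and the function names are hypothetical.

```python
import re

# ASSUMPTION: exact textbook sigma70 consensus, -35 TTGACA and -10 TATAAT,
# separated by a 15-19 bp spacer. The lookahead lets overlapping sites at
# different start positions each be counted once.
SIGMA70 = re.compile(r"(?=(TTGACA[ACGT]{15,19}TATAAT))")

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_fraction(seq):
    """Fraction of unambiguous bases (A, C, G, T) that are G or C."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / acgt


def reverse_complement(seq):
    """Reverse complement; characters other than A, C, G, T are left as-is."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def sigma70_hits_per_kb(seq):
    """Consensus-like sites per kilobase, counting start positions on both strands."""
    seq = seq.upper()
    if not seq:
        return 0.0
    hits = 0
    for strand in (seq, reverse_complement(seq)):
        hits += sum(1 for _ in SIGMA70.finditer(strand))
    return 1000.0 * hits / len(seq)


if __name__ == "__main__":
    # Toy sequence containing one consensus site with a 17 bp spacer.
    toy = "TTGACA" + "A" * 17 + "TATAAT"
    print(gc_fraction(toy), sigma70_hits_per_kb(toy))
```

Normalizing hits per kilobase makes genomes of different sizes comparable. A real analysis would likely tolerate mismatches or use a position weight matrix rather than exact hexamers, so counts from this sketch should be read as a lower bound.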


Introduction

Clone libraries provide researchers with a powerful resource to study nucleic acid from diverse sources. It has been previously observed that fosmid libraries exhibit a GC bias [1, 2]. Such cloning biases may affect conclusions derived from analysis of the clone libraries. The observed GC bias was suggested to be due to fragmentation and subsequent loss of AT-rich sequences during the cloning process, purportedly because AT-rich sequences have fewer hydrogen bonds, which makes them more vulnerable to non-perpendicular shear forces [1]. It seemed to us unlikely that the GC bias in cosmid/fosmid libraries is due to fragmentation of AT-rich sequences, as Temperton et al. [1] suggested; rather, we believe that events occurring in vivo may be contributing substantially to the sequence bias of libraries.
