Abstract

Functional (meta)genomics allows the high-throughput identification of functional genes in a premise-free manner. However, Sanger sequencing of high GC DNA templates remains difficult, which hinders the functional genomic exploration of high GC genomic libraries. Here, we developed a procedure that resolves this problem by coupling the Sanger and PacBio sequencing strategies. To test the procedure, we identified cadmium (Cd) resistance genes from a small-insert genomic library generated from a high GC (75.35%) bacterial genome. Nineteen clones that conferred Cd resistance to Escherichia coli were subjected directly to Sanger sequencing. In parallel, the positive clones were amplified in vivo in host cells, and the recombinant plasmids were extracted and linearized with selected restriction endonucleases. PacBio sequencing was then performed to obtain the full-length sequences. The partial sequences from Sanger sequencing served as identifiers and were aligned to the full-length sequences from PacBio sequencing, which led to the identification of seven unique full-length sequences. The unique sequences were further aligned to the full genome sequence of the source strain. Functional screening showed that all of the identified positive clones improved the Cd resistance of the host cells. The functional genomic procedure developed here couples the Sanger and PacBio sequencing methods and overcomes the difficulties of PCR approaches for high GC DNA. It is a promising option for the high-throughput sequencing of functional genomic libraries and can enable cost-effective and time-efficient identification of positive clones, particularly for high GC genetic materials.
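The matching step described above, in which partial Sanger reads act as identifiers for full-length PacBio sequences, can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names (`reverse_complement`, `assign_reads`, `unique_inserts`) and the data layout are hypothetical, and exact substring matching stands in for a real alignment tool, which would be needed to tolerate sequencing errors.

```python
from typing import Dict, List

# Translation table for complementing DNA bases (assumes plain A/C/G/T input).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def assign_reads(
    partial_reads: Dict[str, str],
    full_length: Dict[str, str],
) -> Dict[str, List[str]]:
    """Map each full-length sequence ID to the clone IDs whose partial read
    occurs in it on either strand.

    Simplifying assumption: a partial read matches only if it is an exact
    substring of the full-length sequence or of its reverse complement.
    """
    hits: Dict[str, List[str]] = {fid: [] for fid in full_length}
    for clone_id, read in partial_reads.items():
        forward = read.upper()
        reverse = reverse_complement(forward)
        for fid, seq in full_length.items():
            target = seq.upper()
            if forward in target or reverse in target:
                hits[fid].append(clone_id)
    return hits


def unique_inserts(hits: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Keep only full-length sequences supported by at least one clone."""
    return {fid: clones for fid, clones in hits.items() if clones}
```

Calling `assign_reads` with a dictionary of clone reads and a dictionary of full-length sequences, then passing the result to `unique_inserts`, groups clones that carry the same insert. This mirrors how several positive clones can collapse into a smaller set of unique full-length sequences.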

Highlights

  • Base composition substantially impacts genome stability and evolution [1], and high-GC content is thought to be associated with high selective pressure [2]

  • The positive clones are subjected to Sanger sequencing, and aliquots of them are amplified in vivo in host cells in parallel

  • To verify the derivation of the sequences obtained from the functional genomics procedure, the complete genome of Cellulomonas sp. strain Y8 was sequenced by using the Illumina HiSeq (Illumina, San Diego, CA, USA) and PacBio RS II platforms (Pacific Biosciences, Menlo Park, CA, USA)


Summary

Introduction

Base composition substantially impacts genome stability and evolution [1], and high GC content is thought to be associated with high selective pressure [2]. PacBio single-molecule real-time (SMRT) sequencing provides a PCR-independent and efficient way to generate long reads with uniform coverage and high consensus accuracy by detecting the fluorescent signal of single phospholinked nucleotides [17]. The procedure developed here was applied to a small-insert genomic DNA library of high GC content for the identification of Cd resistance genes. It overcomes the difficulties of PCR approaches for high GC gene templates and enables cost-effective and time-efficient identification of positive clones.

Experimental Design
The Strain and Culture Conditions
DNA Extraction
Full Genome Sequencing
Functional Genomic Screening
Sanger and PacBio Sequencing of Amplicons
Open Reading Frame Prediction and Annotation
Drop Assay
Results and Discussion
