Abstract

Genome mining has become an increasingly powerful, scalable, and economically accessible tool for the study of natural product biosynthesis and drug discovery. However, there remain important biological and practical problems that can complicate or obscure biosynthetic analysis in genomic and metagenomic sequencing projects. Here, we focus on limitations of available technology as well as computational and experimental strategies to overcome them. We review the unique challenges and approaches in the study of symbiotic and uncultured systems, as well as those associated with biosynthetic gene cluster (BGC) assembly and product prediction. Finally, to explore sequencing parameters that affect the recovery and contiguity of large and repetitive BGCs assembled de novo, we simulate Illumina and PacBio sequencing of the Salinispora tropica genome focusing on assembly of the salinilactam (slm) BGC.

Highlights

  • As sequencing costs continue to decrease [1], it is more feasible than ever to sequence the genomes of natural product-producing organisms

  • We were only able to resolve the pathway into a single contiguous sequence using 30× coverage with PacBio sequencing, which is not practically feasible for most metagenomic applications, due to cost and difficulties involved in obtaining DNA of high enough quality

  • Binning methods used to investigate biosynthetic gene clusters (BGCs) have a number of notable strengths and weaknesses
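
To give a sense of scale for the 30× coverage figure above, the standard relation between coverage, read count, and genome size (coverage = reads × mean read length / genome size) can be sketched as follows. This is a minimal illustration, not the simulation used in the paper: the function name, the approximate genome size of 5.2 Mb, and both read lengths are assumptions chosen for the example.

```python
import math


def reads_for_coverage(genome_size_bp: int, mean_read_length_bp: int, coverage: float) -> int:
    """Return the smallest read count N such that N * L / G >= coverage.

    G is the genome size in bp and L is the mean read length in bp.
    This ignores uneven coverage, read-length distributions, and errors.
    """
    if genome_size_bp <= 0 or mean_read_length_bp <= 0 or coverage < 0:
        raise ValueError("genome size and read length must be positive; coverage non-negative")
    return math.ceil(coverage * genome_size_bp / mean_read_length_bp)


# Assumed, approximate genome size for illustration only.
GENOME_SIZE_BP = 5_200_000

# Hypothetical mean read lengths: a long-read and a short-read scenario.
long_read_count = reads_for_coverage(GENOME_SIZE_BP, 10_000, 30)
short_read_count = reads_for_coverage(GENOME_SIZE_BP, 150, 30)

print(f"Long reads needed for 30x:  {long_read_count}")
print(f"Short reads needed for 30x: {short_read_count}")
```

In a metagenome, the same depth must be reached for the target organism alone, so the total sequencing effort scales inversely with that organism's relative abundance in the sample.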

Summary

Introduction

As sequencing costs continue to decrease [1], it is more feasible than ever to sequence the genomes of natural product-producing organisms. [... Drugs 2017, 15, 165 ...] a compound, since genes involved in resistance mechanisms are often clustered with natural product biosynthetic genes [6,7]. Another motivation for sequencing pathways is to establish a renewable supply of the compound of interest, either through engineering of the producing organism [8,9] or by heterologous expression [10]. In the case of metagenomics projects, the genomes of other co-localized species often complicate or obscure the specific pathway of interest. This information can tie primary [11] and secondary metabolic pathways [12,13] to a specific organism, allowing one to investigate the producing organism’s ecology and/or evolutionary history [14]. We discuss some biological, evolutionary, and ecological factors warranting consideration throughout the course of sequencing projects.

Evolution of Biosynthetic Pathways
Pathways from Symbiotic and Uncultured Sources
Capabilities and Limitations of Current Sequencing Technologies
Metagenomic Binning and Practical Considerations for BGC Analysis
Successful Examples of Repetitive BGCs Analyzed by De Novo Assembly
Findings
Conclusions