Abstract
Genome mining has become an increasingly powerful, scalable, and economically accessible tool for the study of natural product biosynthesis and drug discovery. However, there remain important biological and practical problems that can complicate or obscure biosynthetic analysis in genomic and metagenomic sequencing projects. Here, we focus on limitations of available technology as well as computational and experimental strategies to overcome them. We review the unique challenges and approaches in the study of symbiotic and uncultured systems, as well as those associated with biosynthetic gene cluster (BGC) assembly and product prediction. Finally, to explore sequencing parameters that affect the recovery and contiguity of large and repetitive BGCs assembled de novo, we simulate Illumina and PacBio sequencing of the Salinispora tropica genome, focusing on assembly of the salinilactam (slm) BGC.
Highlights
As sequencing costs continue to decrease [1], it is more feasible than ever to sequence the genomes of natural product-producing organisms
We were only able to resolve the pathway into a single contiguous sequence using 30× coverage with PacBio sequencing, which is not practically feasible for most metagenomic applications, due to cost and difficulties involved in obtaining DNA of high enough quality
Binning methods used to investigate biosynthetic gene clusters (BGCs) have a number of notable strengths and weaknesses
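To give a sense of what a figure such as 30× coverage implies, the sketch below uses the standard definition of mean sequencing depth, C = N·L/G (number of reads times mean read length, divided by genome size). The genome size and read lengths are illustrative placeholders, not values taken from the article, and the function names are hypothetical.

```python
import math


def coverage_from_reads(n_reads: int, mean_read_length_bp: float, genome_size_bp: int) -> float:
    """Mean depth of coverage: total sequenced bases divided by genome size."""
    return n_reads * mean_read_length_bp / genome_size_bp


def reads_for_coverage(genome_size_bp: int, mean_read_length_bp: float, coverage: float) -> int:
    """Smallest number of reads whose total bases reach the target mean coverage."""
    return math.ceil(coverage * genome_size_bp / mean_read_length_bp)


if __name__ == "__main__":
    # Assumed, illustrative values (not from the article):
    genome_size = 5_000_000   # a 5 Mb bacterial genome
    long_read = 10_000        # mean long-read length in bp
    short_read = 150          # short-read length in bp

    print(reads_for_coverage(genome_size, long_read, 30))
    print(reads_for_coverage(genome_size, short_read, 30))
```

Mean depth is only an average over the genome. In a metagenome, each organism's share of the reads scales with its relative abundance, so reaching a given depth for a low-abundance producer requires proportionally more total sequencing.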
Summary
As sequencing costs continue to decrease [1], it is more feasible than ever to sequence the genomes of natural product-producing organisms. (Drugs 2017, 15, 165) [...] a compound, since genes involved in resistance mechanisms are often clustered with natural product biosynthetic genes [6,7]. Another motivation for sequencing pathways is to establish a renewable supply of the compound of interest, either through engineering of the producing organism [8,9] or by heterologous expression [10]. In the case of metagenomics projects, the genomes of other co-localized species often complicate or obscure the specific pathway of interest. This information can tie primary [11] and secondary metabolic pathways [12,13] to a specific organism, allowing one to investigate the producing organism’s ecology and/or evolutionary history [14]. We discuss some biological, evolutionary, and ecological factors warranting consideration throughout the course of sequencing projects.