Abstract

Microorganisms produce secondary metabolites with a remarkable range of bioactive properties. The constantly increasing amount of published genomic data provides the opportunity for efficient identification of biosynthetic gene clusters by genome mining. On the other hand, for many natural products with resolved structures, the encoding biosynthetic gene clusters have not been identified yet. Of those secondary metabolites, the scaffolds of nonribosomal peptides and polyketides (type I modular) can be predicted due to their building block-like assembly. SeMPI v2 provides a comprehensive prediction pipeline, which includes the screening of the scaffold in publicly available natural compound databases. The screening algorithm was designed to detect homologous structures even for partial, incomplete clusters. The pipeline allows linking of gene clusters to known natural products and therefore also provides a metric to estimate the novelty of the cluster if a matching scaffold cannot be found. Whereas currently available tools attempt to provide comprehensive information about a wide range of gene clusters, SeMPI v2 aims to focus on precise predictions. Therefore, the cluster detection algorithm, including building block generation and domain substrate prediction, was thoroughly refined and benchmarked, to provide high-quality scaffold predictions. In a benchmark based on 559 gene clusters, SeMPI v2 achieved comparable or better results than antiSMASH v5. Additionally, the SeMPI v2 web server provides features that can help to further investigate a submitted gene cluster, such as the incorporation of a genome browser, and the possibility to modify a predicted scaffold in a workbench before the database screening.

Highlights

  • Microorganisms, such as bacteria and fungi, have always been subject to evolutionary pressure

  • Detection of biosynthetic gene clusters (BGCs) relies on the correct identification of the domains involved in the biosynthesis of secondary metabolites (SMs)

  • The ACP and bACP domains are wrongly classified in some cases, but since the role of ACP and bACP is identical for the SeMPI v2 module detection algorithm, the misclassification does not pose a problem for the overall scaffold generation
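
The last highlight can be illustrated with a minimal sketch. It is not the SeMPI v2 implementation: the domain labels, the `canonical` helper, and the simplified rule that a module is closed by a carrier-protein domain are all assumptions made for illustration. The sketch only shows why confusing ACP with bACP leaves the module boundaries unchanged when both labels play the same role.

```python
# Hypothetical illustration, not the SeMPI v2 code.
# Assumption: any carrier-protein domain closes the current module.
CARRIER_ALIASES = {"ACP": "CP", "bACP": "CP"}


def canonical(domain: str) -> str:
    """Map interchangeable carrier-protein labels to one shared label."""
    return CARRIER_ALIASES.get(domain, domain)


def split_modules(domains: list[str]) -> list[list[str]]:
    """Group an ordered domain list into modules, closing each at a carrier domain."""
    modules: list[list[str]] = []
    current: list[str] = []
    for domain in domains:
        label = canonical(domain)
        current.append(label)
        if label == "CP":
            modules.append(current)
            current = []
    if current:
        # Trailing domains that are not followed by a carrier domain
        modules.append(current)
    return modules


# Made-up domain orders: the second one mislabels the first ACP as bACP.
correct = ["KS", "AT", "ACP", "KS", "AT", "KR", "ACP", "TE"]
misclassified = ["KS", "AT", "bACP", "KS", "AT", "KR", "ACP", "TE"]

print(split_modules(correct) == split_modules(misclassified))
```

Because both labels collapse to the same canonical label before the modules are split, the two inputs yield the same module list in this toy model.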

Summary

Introduction

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations. Microorganisms, such as bacteria and fungi, have always been subject to evolutionary pressure. At the level of biosynthesis, microorganisms “learned” to produce a vast number of natural products that help them to survive [1]. These secondary metabolites (SMs) often possess biological activities, which can be exploited for pharmaceutical purposes. The molecular machinery for the production of SMs is encoded in gene assemblies organized in biosynthetic gene clusters (BGCs). For most known SMs, the encoding BGC has not yet been discovered.
