Abstract

Rediscovery of known natural products hinders the discovery of new, unique scaffolds. Efforts have mostly focused on streamlining the determination of which compounds are known vs. unknown (dereplication), but an alternative strategy is to focus on what is different. By utilizing statistics and assuming that common actinobacterial metabolites are likely known, focus can be shifted away from dereplication and towards discovery. LC-MS-based principal component analysis (PCA) is well suited to distinguishing unique from common metabolites, but the variability inherent within natural products leads to datasets that do not fit ideal standards. To simplify the analysis of PCA models, we developed a script that identifies only those masses or molecules that are unique to each strain within a group, thereby greatly reducing the number of data points to be inspected manually. Since the script is written in R, it facilitates integration with other metabolomics workflows and supports automated mass matching to databases such as Antibase.

Highlights

  • While natural products have provided important drug leads, especially in the area of oncology and infectious disease, the abundance of known molecules plagues the discovery process

  • We and others have demonstrated that multivariate analysis of liquid chromatography-mass spectrometry (LC-MS) traces can be used effectively for strain prioritization and discovery [15,16,17,18,19,20,21,22,23,24]

  • We have demonstrated that finding so-called “outliers” using principal component analysis (PCA) is an effective route for the discovery of novel natural products [2,14,25,26]

Introduction

While natural products have provided important drug leads, especially in oncology and infectious disease, the abundance of known molecules plagues the discovery process. Methods that can identify molecules unique to particular strains offer a route to reduce the number of m/z values that need to be inspected as potential novel natural products. Together with our observation that identifying “unique” molecules within related groups of actinomycetes leads to the discovery of novel chemotypes, this suggests that looking only at molecules unique to individual strains would be an advantageous path for quickly identifying putative novel natural products. Focusing on the differences between strains allows us to inspect the outlying data quickly for unique molecules. This facilitates dereplication and can be used to prioritize strains prior to extract production, for whole genome sequencing, or for the application of MS/MS approaches such as Global. In addition to marine natural products, the script can be integrated into other metabolomics workflows, such as plant or NMR metabolomics, with minor modifications.
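The idea of keeping only the masses unique to each strain can be illustrated with a minimal sketch. This is not the authors' R script, which works from PCA models of LC-MS data. It is a simplified Python illustration that assumes each strain has already been reduced to a list of m/z values and that two masses count as the same when they differ by no more than a fixed tolerance. The function name `unique_masses`, the tolerance of 0.01, and the example masses are all hypothetical and chosen only for illustration; retention time is ignored.

```python
def unique_masses(peaks_by_strain, tol=0.01):
    """Return, for each strain, the m/z values with no match in any other strain.

    peaks_by_strain: dict mapping a strain name to a list of m/z values.
    tol: assumed absolute m/z tolerance for treating two masses as the same.
    """
    unique = {}
    for strain, masses in peaks_by_strain.items():
        # Pool every mass observed in the other strains of the group.
        others = [
            m
            for other_strain, other_masses in peaks_by_strain.items()
            if other_strain != strain
            for m in other_masses
        ]
        # Keep a mass only if nothing in the pool falls within the tolerance.
        unique[strain] = sorted(
            m for m in masses if all(abs(m - o) > tol for o in others)
        )
    return unique


# Made-up masses for three hypothetical strains in one group.
example = {
    "strain_A": [301.141, 455.290, 812.432],
    "strain_B": [301.142, 455.291, 623.310],
    "strain_C": [301.140, 455.289],
}

result = unique_masses(example)
for strain, masses in result.items():
    print(strain, masses)
```

In this toy group, the masses shared by all strains are discarded and only the strain-specific ones remain for manual inspection or database matching. Because the comparison is made within the group, the result depends on which strains are grouped together.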
