Abstract

The identification of chemical structures in natural product mixtures is an important task in drug discovery, but it remains challenging: structural elucidation is time-consuming and is limited by the available mass spectra of known natural products. Computer-aided structure elucidation (CASE) strategies seek to automatically propose a list of possible chemical structures in mixtures by utilizing chromatographic and spectroscopic methods. However, current CASE tools still cannot automatically solve structures for experienced natural product chemists. Here, after analyzing a collection of 243,130 natural products, we formulated the structural elucidation of natural products in a mixture as a computational problem in which a list of scaffolds is extended using a weighted side chain list, and we designed an efficient algorithm to precisely identify the chemical structures. This problem is NP-complete. A dynamic programming (DP) algorithm can solve it in pseudo-polynomial time after converting floating-point molecular weights into integers. However, the running time of the DP algorithm grows exponentially as the precision of the mass spectrometry experiment increases. With the aim of solving the problem in polynomial time, we proposed a novel iterative DP algorithm that can quickly recognize the chemical structures of natural products. When applied to four natural products whose structures had been determined experimentally, the algorithm found the exact solutions, and its running time was shown to be polynomial in the average case. The proposed method improved the speed of the structural elucidation of natural products and helped broaden the spectrum of available compounds that could be applied as new drug candidates. A web service built for structural elucidation studies is freely accessible at http://csccp.cmdm.tw/.
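The pseudo-polynomial DP idea mentioned in the abstract can be illustrated with a generic subset-sum style sketch. This is not the authors' algorithm or their iterative variant. The function names, the toy masses, and the assumption that each side chain may be reused any number of times are all hypothetical, and the side-chain weighting described in the abstract is omitted.

```python
from typing import Dict, List, Optional


def to_int(mass: float, decimals: int) -> int:
    """Scale a floating-point mass to an integer at the chosen precision."""
    return round(mass * 10 ** decimals)


def find_side_chains(
    scaffold: float,
    target: float,
    side_chains: Dict[str, float],
    decimals: int = 2,
) -> Optional[List[str]]:
    """Return one multiset of side-chain names whose integerized masses
    sum to (target - scaffold), or None if no combination exists.

    Assumption: side chains may be reused (unbounded subset sum).
    """
    gap = to_int(target, decimals) - to_int(scaffold, decimals)
    if gap < 0:
        return None
    weights = {name: to_int(m, decimals) for name, m in side_chains.items()}

    # reachable[s] is True if some multiset of side chains sums to s.
    # choice[s] records one side chain that completes such a multiset.
    reachable = [False] * (gap + 1)
    choice: List[Optional[str]] = [None] * (gap + 1)
    reachable[0] = True
    for s in range(1, gap + 1):
        for name, w in weights.items():
            if 0 < w <= s and reachable[s - w]:
                reachable[s] = True
                choice[s] = name
                break

    if not reachable[gap]:
        return None

    # Walk back through the recorded choices to recover one solution.
    result: List[str] = []
    s = gap
    while s > 0:
        name = choice[s]
        result.append(name)
        s -= weights[name]
    return sorted(result)


# Toy, made-up masses (not taken from the paper).
toy_chains = {"A": 15.02, "B": 17.00, "C": 31.02}
solution = find_side_chains(100.00, 132.02, toy_chains, decimals=2)
no_solution = find_side_chains(100.00, 100.01, toy_chains, decimals=2)
```

The DP table has one entry per integer mass unit up to the gap, so each extra decimal digit of precision multiplies its size by ten. This is the cost growth the abstract attributes to higher mass spectrometry precision. Rounding each mass independently can also introduce off-by-one mismatches at the chosen precision, which a practical tool would need to handle with a tolerance.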

Highlights

  • Examining natural and therapeutic products is crucial for drug development because many chemically synthesized compounds have potentially serious toxicity and adverse effects, while less toxic compounds extracted from natural products could possibly be developed into new drug candidates [1]

  • For the structural elucidation of complex natural products, we defined a new Chemical Substituents-Core Combinatorial Problem (CSCCP) based only on the information obtained from mass spectrometry (MS) spectra

  • Solving the CSCCP requires exponential time in the worst-case scenario


Summary

Introduction

Examining natural and therapeutic products is crucial for drug development because many chemically synthesized compounds have potentially serious toxicity and adverse effects, while less toxic compounds extracted from natural products could possibly be developed into new drug candidates [1]. Because natural product databases are limited in size, high-throughput screening methods cannot be used to effectively identify potential natural product drugs. Although high-resolution and high-dimensional NMR methods have undergone continual advancement [10,11,12], NMR still cannot independently elucidate novel chemical structures unless co-eluting compounds can be completely separated [8]. Even though LC–NMR–MS [13] and LC–UV–solid-phase extraction–NMR–MS [9] have proven to be effective methods to elucidate compound structures in natural product extracts, the successful structural elucidation of unknown compounds still greatly depends on the development of computational systems to help evaluate the mass spectral data [14].

