Abstract

Library matching using carbon-13 nuclear magnetic resonance (13C NMR) spectra is a popular method in compound identification systems. However, the usability of existing approaches is restricted because enlarging a library that pairs each chemical structure with its spectrum is costly and time-consuming. We therefore propose a fundamentally different approach that matches 13C NMR spectra directly against a molecular structure library. We develop a cross-modal retrieval between spectrum and structure (CReSS) system using deep contrastive learning, which allows a molecular structure library to be searched with the 13C NMR spectrum of a compound. In a test searching 41,494 13C NMR spectra against a reference structure library containing 10.4 million compounds, CReSS reached a recall@10 of 91.64% and a processing speed of 0.114 s per query spectrum. With an additional filter using a molecular weight tolerance of 5 Da, recall@10 rose to 98.39%. Furthermore, CReSS shows potential for detecting the scaffolds of novel structures and performs well on the task of structural revision. CReSS is built to bridge the gap between 13C NMR spectra and structures and could be generally applicable in compound identification.
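
The retrieval and evaluation steps described above can be illustrated with a minimal sketch. It assumes that spectra and structures have already been mapped into a shared embedding space by some pair of contrastively trained encoders; those encoders are not shown, and nothing here reproduces the actual CReSS implementation. The function names (`l2_normalize`, `retrieve_top_k`, `recall_at_k`), the use of cosine similarity, and the brute-force scan are all assumptions made for illustration. Only the top-10 cutoff and the 5 Da molecular weight tolerance come from the abstract.

```python
import numpy as np


def l2_normalize(x):
    """Scale each row to unit length so dot products equal cosine similarity."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.clip(norms, 1e-12, None)


def retrieve_top_k(query_emb, library_emb, k=10,
                   query_mw=None, library_mw=None, mw_tol=5.0):
    """Return indices of the k library structures most similar to one spectrum.

    query_emb:   (d,) embedding of a query spectrum (hypothetical encoder output)
    library_emb: (n, d) embeddings of the structure library
    query_mw, library_mw: optional molecular weights in Da; when both are given,
    structures outside +/- mw_tol of the query weight are excluded.
    """
    q = l2_normalize(query_emb[None, :])[0]
    scores = l2_normalize(library_emb) @ q  # cosine similarity per structure
    if query_mw is not None and library_mw is not None:
        outside = np.abs(library_mw - query_mw) > mw_tol
        scores = np.where(outside, -np.inf, scores)
    top = np.argsort(-scores)[:k]  # best-scoring structures first
    return top[np.isfinite(scores[top])]  # drop candidates removed by the filter


def recall_at_k(query_embs, true_idx, library_emb, k=10):
    """Fraction of queries whose true structure appears among the top k hits."""
    hits = sum(int(t in retrieve_top_k(q, library_emb, k=k))
               for q, t in zip(query_embs, true_idx))
    return hits / len(true_idx)
```

In this sketch the molecular weight filter only removes candidates before ranking, so it can raise recall@10 by discarding similar-looking structures with an incompatible mass, which is consistent with the improvement the abstract reports.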
