Abstract

A pivotal challenge in metabolite research is the structural annotation of metabolites from tandem mass spectrometry (MS/MS) data. The integration of artificial intelligence (AI) has transformed the interpretation of MS data, facilitating the identification of elusive metabolites. Recent methods primarily focus on transforming MS/MS spectra or molecular structures into a unified modality to enable similarity-based comparison and interpretation. In this work, we present CMSSP, a novel Contrastive Mass Spectra-Structure Pretraining framework designed for metabolite annotation. The primary objective of CMSSP is to establish a shared representation space in which MS/MS spectra and molecular structures can be compared directly, despite being distinct modalities. Evaluation on two benchmark test sets demonstrates the efficacy of the approach: CMSSP outperformed state-of-the-art methods by a significant margin in annotation accuracy. Specifically, it improved the top-1 accuracy by 30% on the CASMI 2017 data set and achieved a 16% increase in top-10 accuracy on an independent test set. Moreover, the model displayed superior identification accuracy across all seven chemical categories, showcasing its robustness and versatility. Finally, the MS/MS data of 30 metabolites from Glycyrrhiza glabra were analyzed, achieving top-1 and top-3 accuracies of 86.7% and 100%, respectively. The CMSSP model serves as a potent tool for the dissection and interpretation of intricate MS/MS data, moving the field toward more accurate and efficient metabolite annotation. This not only augments the analytical capabilities of metabolomics but also paves the way for future discoveries in the understanding of complex biological systems.
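
To make the idea of a shared representation space concrete, the following is a minimal sketch of a symmetric contrastive objective and a top-k retrieval metric. It is not the CMSSP implementation: the abstract does not specify the loss, encoders, or temperature, so the CLIP-style formulation, the function names, and the value `temperature=0.07` are all assumptions made for illustration. The embeddings are taken to be precomputed arrays with one row per spectrum or structure.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)


def log_softmax(logits):
    """Row-wise log-softmax, shifted by the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))


def symmetric_contrastive_loss(spec_emb, mol_emb, temperature=0.07):
    """CLIP-style loss (assumed form): row i of spec_emb is paired with row i of mol_emb.

    Matched spectrum/structure pairs are pulled together and all other
    pairs in the batch are treated as negatives, in both directions.
    """
    logits = l2_normalize(spec_emb) @ l2_normalize(mol_emb).T / temperature
    idx = np.arange(logits.shape[0])
    loss_spec_to_mol = -log_softmax(logits)[idx, idx].mean()
    loss_mol_to_spec = -log_softmax(logits.T)[idx, idx].mean()
    return 0.5 * (loss_spec_to_mol + loss_mol_to_spec)


def top_k_accuracy(spec_emb, cand_emb, true_idx, k):
    """Fraction of spectra whose true candidate ranks among the k most similar."""
    sims = l2_normalize(spec_emb) @ l2_normalize(cand_emb).T
    top_k = np.argsort(-sims, axis=1)[:, :k]  # candidate indices, best first
    return float(np.mean([t in row for t, row in zip(true_idx, top_k)]))
```

Under these assumptions, annotation reduces to ranking candidate structures by cosine similarity to the query spectrum's embedding, and the top-1, top-3, and top-10 accuracies quoted above correspond to `top_k_accuracy` with `k` set to 1, 3, and 10.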
