Abstract

Complex metabolite mixtures are challenging to unravel. Mass spectrometry (MS) is a widely used and sensitive technique for obtaining structural information on complex mixtures. However, just knowing the molecular masses of the mixture's constituents is almost always insufficient for confident assignment of the associated chemical structures. Structural information can be augmented through MS fragmentation experiments whereby detected metabolites are fragmented, giving rise to MS/MS spectra. However, how can we maximize the structural information we gain from fragmentation spectra? We recently proposed a substructure-based strategy to enhance metabolite annotation for complex mixtures by considering metabolites as the sum of (bio)chemically relevant moieties that we can detect through mass spectrometry fragmentation approaches. Our MS2LDA tool allows us to discover, in an unsupervised manner, groups of mass fragments and/or neutral losses, termed Mass2Motifs, that often correspond to substructures. After manual annotation, these Mass2Motifs can be used in subsequent MS2LDA analyses of new datasets, thereby providing structural annotations for many molecules that are not present in spectral databases. Here, we describe how additional strategies, taking advantage of (i) combinatorial in silico matching of experimental mass features to substructures of candidate molecules, and (ii) automated machine learning classification of molecules, can facilitate semi-automated annotation of substructures. We show how our approach accelerates the Mass2Motif annotation process and therefore broadens the chemical space spanned by characterized motifs. The machine learning model we use to classify fragmentation spectra learns the relationships between fragmentation spectra and chemical features. Classification predictions for these features can be aggregated over all molecules that contribute to a particular Mass2Motif to guide its annotation.
To make annotated Mass2Motifs available to the community, we also present MotifDB: an open database of Mass2Motifs that can be browsed and accessed programmatically through an Application Programming Interface (API). MotifDB is integrated within ms2lda.org, allowing users to efficiently search for characterized motifs in their own experiments. We expect that with an increasing number of Mass2Motif annotations available through a growing database, we can more quickly gain insight into the constituents of complex mixtures. This will allow prioritization towards novel or unexpected chemistries and faster recognition of known biochemical building blocks.
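The aggregation step mentioned in the abstract, in which per-molecule classification predictions are pooled over the molecules that contribute to a Mass2Motif, might look roughly like the sketch below. This is an illustration under stated assumptions, not the MS2LDA implementation: the function name, the dictionary layout, the membership weights, the `min_membership` threshold, the weighted-average rule, and all example values are hypothetical.

```python
from collections import defaultdict


def aggregate_motif_classes(memberships, predictions, min_membership=0.1):
    """Weighted average of class probabilities per motif (hypothetical helper).

    memberships: {molecule_id: {motif_id: weight}}, how strongly each molecule
                 is associated with each motif (assumed layout).
    predictions: {molecule_id: {class_label: probability}}, classifier output
                 per molecule (assumed layout).
    min_membership: assumed cut-off below which a molecule is ignored.
    """
    totals = defaultdict(lambda: defaultdict(float))
    weight_sums = defaultdict(float)

    for molecule, motifs in memberships.items():
        if molecule not in predictions:
            continue  # no classifier output for this molecule
        for motif, weight in motifs.items():
            if weight < min_membership:
                continue  # weak association, skip
            weight_sums[motif] += weight
            for label, probability in predictions[molecule].items():
                totals[motif][label] += weight * probability

    # Normalise by the total membership weight of each motif.
    return {
        motif: {label: s / weight_sums[motif] for label, s in labels.items()}
        for motif, labels in totals.items()
    }


# Made-up example data, for illustration only.
memberships = {
    "mol_a": {"motif_1": 0.8, "motif_2": 0.05},
    "mol_b": {"motif_1": 0.2},
    "mol_c": {"motif_2": 0.5},
}
predictions = {
    "mol_a": {"flavonoid": 0.9, "amine": 0.1},
    "mol_b": {"flavonoid": 0.4, "amine": 0.6},
    "mol_c": {"flavonoid": 0.0, "amine": 1.0},
}

scores = aggregate_motif_classes(memberships, predictions)
for motif, labels in sorted(scores.items()):
    best = max(labels, key=labels.get)
    print(motif, best, round(labels[best], 2))
```

A weighted average is only one plausible pooling rule; a majority vote or a count of molecules above a probability threshold would fit the same description.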

Highlights

  • Complex natural mixtures are full of specialized metabolites with diverse structures and functions.[1]

  • Based on the above examples, we show how MAGMa annotations are very helpful during the Mass2Motif annotation process.

  • We have described multiple extensions to the MS2LDA platform that enhance the ability of analysts to characterize the makeup of complex mixtures of metabolites.


Summary

Introduction

Complex natural mixtures are full of specialized metabolites with diverse structures and functions.[1] In untargeted metabolomics approaches, these molecules give rise to information-rich mass spectral data sets and a key challenge is the interpretation of this data, in terms of identifying chemical structures.[2,3] This process is commonly referred to as metabolite annotation and identification,[4] a highly challenging process that typically enables the assignment of chemical structures to only a very small percentage of the molecules detected.[2,5,6,7] The rapid and automated identification of chemical structures is one of the main obstacles hindering the discovery of novel bioactive molecules addressing global health care threats, such as antimicrobial resistance, cancer or inflammatory diseases. The structural annotation of Mass2Motifs is currently performed via a combination of manual peak searching in MS/MS

