Propagating annotations of molecular networks using in silico fragmentation.

Ricardo R Da Silva, Louis-Félix Nothias, Marcy J Balunas, Andrés Mauricio Caraballo-Rodríguez, Norberto Peporine Lopes, Pieter C Dorrestein, Evan Fox, Mingxun Wang, Justin J J Van Der Hooft, Jonathan L Klassen, Avner Schlessinger

doi:10.1371/journal.pcbi.1006089

Open Access

Journal: PLoS computational biology	Publication Date: Apr 18, 2018
Citations: 243	License type: CC BY 4.0

Abstract

The annotation of small molecules is one of the most challenging and important steps in untargeted mass spectrometry analysis, as most of our biological interpretations rely on structural annotations. Molecular networking has emerged as a structured way to organize and mine data from untargeted tandem mass spectrometry (MS/MS) experiments and has been widely applied to propagate annotations. However, propagation is done through manual inspection of MS/MS spectra connected in the spectral networks and is only possible when a reference library spectrum is available. An alternative approach to annotating an unknown fragmentation mass spectrum is the use of in silico predictions. One of the challenges of in silico annotation is the uncertainty about which structure in the predicted candidate list is correct. Here we show how molecular networking can be used to improve the accuracy of in silico predictions through propagation of structural annotations, even when there is no match to an MS/MS spectrum in spectral libraries. This is accomplished by creating a network consensus of re-ranked structural candidates, using the molecular network topology and structural similarity to improve in silico annotations. The Network Annotation Propagation (NAP) tool is accessible through the GNPS web platform at https://gnps.ucsd.edu/ProteoSAFe/static/gnps-theoretical.jsp.

Highlights

One way to gain insight into the molecules of a biological sample is through mass spectrometry
The corresponding Network Annotation Propagation (NAP) jobs can be accessed through the web interface with the following job IDs: for NIST library -
http://proteomics2.ucsd.edu/ProteoSAFe/status.jsp?task=29d517e67067476bae97a32f2d4977e0
http://proteomics2.ucsd.edu/ProteoSAFe/status.jsp?task=d270e79876cb48deb6aabd52a4fc647e
http://proteomics2.ucsd.edu/ProteoSAFe/status.jsp?task=e2125577fe2646129becc248b96d42ba
http://proteomics2.ucsd.edu/ProteoSAFe/status.jsp?task=81e01fe178d3424686079903d908b536
http://proteomics2.ucsd.edu/ProteoSAFe/status.jsp?task=daa546b038604e5f83eaafb811bd0313
http://proteomics2.ucsd.edu/ProteoSAFe/status.jsp?task=61c8a0d01309408f8ecceb5b31dab1a8
http://proteomics2.ucsd.edu/ProteoSAFe/status.jsp?task=60fe9f77b3d04789997bf19aa1a0a828
http://proteomics2.ucsd.edu/ProteoSAFe/status.jsp?task=53f8494ff9e8423697eebf4e98d287f0
http://proteomics2.ucsd.edu/ProteoSAFe/status.jsp?task=c93a840100ec49bdbb3c12e5ed1e4790
[...] data only cover a small portion of the known molecular space
Molecular relationships, based on spectral similarity, can be used to enhance the structural hypothesis inferred from the annotation of molecules detected by mass spectrometry
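
As a rough illustration of the idea stated in the abstract (re-ranking in silico candidates by consensus with neighboring nodes in the molecular network), the following minimal Python sketch may help. It is not the NAP implementation. Every name in it (`tanimoto`, `rerank`, the `weight` and `top_k` parameters) is hypothetical, candidate fingerprints are assumed to be plain sets of integer feature bits, in silico scores are assumed to lie in [0, 1], and the linear mix of score and neighbor similarity is an assumption made only for this example.

```python
from typing import Dict, List, Set, Tuple

# (candidate name, in silico score in [0, 1], fingerprint as a set of bits)
Candidate = Tuple[str, float, Set[int]]


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto (Jaccard) similarity between two fingerprint bit sets."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def rerank(
    node: str,
    candidates: Dict[str, List[Candidate]],
    edges: Dict[str, List[str]],
    weight: float = 0.5,
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Re-rank the candidates of `node` using its network neighbors.

    For each candidate, the neighbor support is the mean (over neighbors
    that have candidates) of the best structural similarity to any of the
    neighbor's `top_k` highest-scoring candidates. The final score mixes
    the original in silico score with this support (illustrative choice).
    """
    neighbor_sets = []
    for nb in edges.get(node, []):
        nb_cands = sorted(candidates.get(nb, []), key=lambda c: c[1], reverse=True)
        if nb_cands:
            neighbor_sets.append(nb_cands[:top_k])

    ranked = []
    for name, score, fp in candidates.get(node, []):
        if neighbor_sets:
            support = sum(
                max(tanimoto(fp, nb_fp) for _, _, nb_fp in nb_cands)
                for nb_cands in neighbor_sets
            ) / len(neighbor_sets)
        else:
            support = 0.0  # isolated node: nothing to propagate
        ranked.append((name, (1.0 - weight) * score + weight * support))

    # Highest combined score first.
    return sorted(ranked, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Toy data: node "A" has two candidates; its neighbor "B" has one.
    cands = {
        "A": [("x", 0.6, {1, 2, 3}), ("y", 0.5, {7, 8, 9})],
        "B": [("z", 0.9, {7, 8, 9, 10})],
    }
    net = {"A": ["B"], "B": ["A"]}
    for name, combined in rerank("A", cands, net):
        print(name, round(combined, 3))
```

In this toy network, candidate "y" starts with a lower in silico score than "x" but is structurally closer to the neighbor's candidate, so the consensus step moves it to the top. Real fingerprints and scores would come from a cheminformatics toolkit and an in silico fragmentation tool, which this sketch deliberately leaves out.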

Introduction

One way to gain insight into the molecules of a biological sample is through mass spectrometry. In an untargeted mass spectrometry experiment, we do not set the mass spectrometer to weigh only specific molecules; instead, we have the potential to observe hundreds to thousands of ions from a single sample. Yet most experiments report on only one or a few dozen molecules, often within the limits of known pathways described in textbooks [2]. Such pathways represent only a fraction of the molecules that are detected. By matching fragmentation spectra against reference libraries, we can annotate 2% of the data (the average rate of spectral library matching across all GNPS datasets) [3]; for well-studied biological matrices, such as Escherichia coli, human cell lines, plasma, or urine, this may be as high as 10% [4].