Abstract

Motivation: Imaging mass spectrometry (imaging MS) is a prominent technique for capturing distributions of molecules in tissue sections. Various computational methods for imaging MS rely on quantifying spatial correlations between ion images, referred to as co-localization. However, no comprehensive evaluation of co-localization measures has ever been performed; this leads to arbitrary choices and hinders method development.

Results: We present ColocML, a machine learning approach addressing this gap. With the help of 42 imaging MS experts from nine laboratories, we created a gold standard of 2210 pairs of ion images ranked by their co-localization. We evaluated existing co-localization measures and developed novel measures using term frequency–inverse document frequency and deep neural networks. The semi-supervised deep learning Pi model and the cosine score applied after median thresholding performed the best (Spearman 0.797 and 0.794 with expert rankings, respectively). We illustrate these measures by inferring co-localization properties of 10 273 molecules from 3685 public METASPACE datasets.

Availability and implementation: https://github.com/metaspace2020/coloc

Supplementary information: Supplementary data are available at Bioinformatics online.

Highlights

  • In the past two decades, a window of opportunity has been opened by the development and further maturation of imaging mass spectrometry, a powerful and versatile technology for spatial molecular analysis (Buchberger et al, 2018; Doerr, 2018; Dreisewerd and Yew, 2017) with a particular interest in clinical (Vaysse et al, 2017) and pharmaceutical applications (Schulz et al, 2019)

  • Our work provides a gold standard set [available at GitHub] which can be used for evaluating future measures, and illustrates how using open-access data, web technologies, community engagement and deep learning opens novel avenues for addressing long-standing challenges in imaging MS

  • 2.7.3 Unsupervised UMAP: We developed a model based on the uniform manifold approximation and projection (UMAP), a recently developed non-linear dimensionality reduction technique with broad applications in biology (McInnes et al, 2018)


Summary

Introduction

Metabolites and lipids play key roles in fuelling and making up cells, determining their types and states. The rapid development and growing popularity of imaging MS, as well as the high dimensionality and sheer size of the generated data, measuring up to hundreds of gigabytes for a tissue section, have stimulated the development of computational methods and software (Alexandrov, 2012). Various methods have been developed for low-dimensional data representation (based on PCA, NMF, t-SNE, biclustering), for finding spatial regions of interest with spatial segmentation, for searching for markers associated with a region of interest, and, recently, for metabolite annotation (Palmer et al, 2017). Many of these methods use some measure of spatial similarity between ion images, often referred to as spatial co-localization.
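
To make the notion of a co-localization measure concrete, the following is a minimal sketch of a cosine score computed after median thresholding, one of the measures named in the abstract. It is not the ColocML implementation. The function names are hypothetical, ion images are assumed to be flattened lists of pixel intensities of equal length, and the thresholding rule (setting pixels below the image's own median to zero) is an assumption about how the threshold is applied.

```python
import math
from statistics import median


def median_threshold(image):
    """Zero out pixels below the image's median intensity.

    Assumption: thresholding means hard zeroing below the median;
    the published method may differ in detail.
    """
    m = median(image)
    return [v if v >= m else 0.0 for v in image]


def cosine_score(a, b):
    """Cosine similarity between two equally sized intensity vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty image is treated as not co-localized
    return dot / (norm_a * norm_b)


def colocalization(a, b):
    """Cosine score of two flattened ion images after median thresholding."""
    if len(a) != len(b):
        raise ValueError("ion images must have the same number of pixels")
    return cosine_score(median_threshold(a), median_threshold(b))


# Two toy 4-pixel "ion images" whose bright regions do not overlap.
img_a = [0.0, 1.0, 5.0, 9.0]
img_b = [9.0, 5.0, 1.0, 0.0]
score_same = colocalization(img_a, img_a)
score_diff = colocalization(img_a, img_b)
```

In this toy case the thresholding removes the low-intensity pixels that would otherwise give the two non-overlapping images a small positive cosine score, so only the bright regions contribute to the comparison.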
