Abstract

This article describes PhenoMeter (PM), a new type of metabolomics database search that accepts metabolite response patterns as queries and searches the MetaPhen database of reference patterns for responses that are statistically significantly similar or inverse, with the aim of detecting functional links. To identify a similarity measure that would detect functional links as reliably as possible, we compared the performance of four statistics in correctly top-matching metabolic phenotypes of Arabidopsis thaliana metabolism mutants affected in different steps of the photorespiration metabolic pathway to reference phenotypes of mutants affected in the same enzymes by independent mutations. The best performing statistic, the PM score, was a function of both Pearson correlation and Fisher’s Exact Test of directional overlap. This statistic outperformed Pearson correlation, biweight midcorrelation and Fisher’s Exact Test used alone. To demonstrate general applicability, we show that PM reliably retrieved the most closely functionally linked response in the database when queried with responses to a wide variety of environmental and genetic perturbations. Attempts to match metabolic phenotypes between independent studies met with varying success, and possible reasons for this are discussed. Overall, our results suggest that integrating pattern-based search tools into metabolomics databases will aid functional annotation of newly recorded metabolic phenotypes, analogously to the way sequence similarity search algorithms have aided the functional annotation of genes and proteins. PM is freely available at MetabolomeExpress (https://www.metabolome-express.org/phenometer.php).
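
As an illustration of the two ingredients named above, the following Python sketch computes a Pearson correlation and a one-sided Fisher's Exact Test of directional overlap for a pair of metabolite response vectors. It is a minimal sketch under stated assumptions, not the PM implementation: the function names, the example values, the choice to treat responses as signed log ratios, and the way the 2x2 direction table is built are all hypothetical. The abstract does not say how the PM score combines the two statistics, so the sketch stops at computing them separately.

```python
from math import comb, sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length response vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def direction_table(x, y):
    """Count metabolites by response direction in query (x) and reference (y).

    Assumption: responses are signed (e.g. log ratios); zeros are skipped.
    Returns (up/up, up/down, down/up, down/down).
    """
    uu = ud = du = dd = 0
    for a, b in zip(x, y):
        if a == 0 or b == 0:
            continue
        if a > 0 and b > 0:
            uu += 1
        elif a > 0:
            ud += 1
        elif b > 0:
            du += 1
        else:
            dd += 1
    return uu, ud, du, dd


def fisher_one_sided(uu, ud, du, dd):
    """One-sided Fisher's exact p-value for excess same-direction overlap.

    Sums the hypergeometric probabilities of seeing at least `uu`
    up/up metabolites with the table margins held fixed.
    """
    n = uu + ud + du + dd
    row_up = uu + ud   # metabolites up in the query
    col_up = uu + du   # metabolites up in the reference
    total = comb(n, col_up)
    upper = min(row_up, col_up)
    tail = sum(comb(row_up, k) * comb(n - row_up, col_up - k)
               for k in range(uu, upper + 1))
    return tail / total


# Hypothetical responses for four metabolites shared by query and reference.
query = [1.0, 2.0, -1.0, -2.0]
reference = [0.5, 1.5, -0.5, -2.5]

print("Pearson r:", pearson(query, reference))
print("Fisher p :", fisher_one_sided(*direction_table(query, reference)))
```

A strongly negative correlation together with an excess of opposite-direction metabolites would point to an inverse response; testing for that would mean summing the opposite tail of the same table.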

Highlights

  • The emergence of high throughput omics technologies has driven an explosive increase in the global rate of biological data generation

  • The earliest Golm Metabolome Database (GMD) versions focused on sharing mass-spectral and retention index (MSRI) reference libraries for gas chromatography/mass spectrometry (GC/MS) peak identification, providing basic browse and text search options

  • After demonstrating the high performance of the adopted scoring method, we describe the permutation-based approach used for significance testing and show that results of typical queries are returned in practical timeframes
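
The permutation-based significance testing mentioned in the last highlight can be pictured with a generic permutation test. The sketch below is only an assumption about the general shape of such a test: this page does not describe the permutation scheme actually used, Pearson correlation stands in for the PM score, and the function names, the number of permutations, and the example data are hypothetical.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation, used here as a stand-in similarity score."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def permutation_p_value(query, reference, score=pearson, n_perm=999, seed=0):
    """Estimate how often a shuffled query scores at least as high as the real one.

    The query values are reassigned to metabolites at random, which breaks
    any real association with the reference while keeping the value
    distribution. The +1 terms count the observed score as one permutation,
    so the estimate is never exactly zero.
    """
    rng = random.Random(seed)
    observed = score(query, reference)
    shuffled = list(query)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if score(shuffled, reference) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical query that closely tracks the reference pattern.
query = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
reference = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0]
print("permutation p-value:", permutation_p_value(query, reference))
```

The cost of such a test grows with the number of permutations times the number of reference patterns scored, which is why query turnaround time is worth reporting for a search tool of this kind.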

Introduction

The emergence of high throughput omics technologies has driven an explosive increase in the global rate of biological data generation. As the challenges of capture, storage, and exchange are overcome in new fields, we anticipate increased efforts to make accumulated data more immediately useful to biologists by equipping databases with advanced analytical capabilities extending beyond simple search, browse, and visualization functions. We describe one such effort in the field of metabolomics: turning a database of experimentally observed metabolite responses into an analytical tool by equipping it with metabolite response pattern-based search functionality.

Subsequent versions of the Golm Metabolome Database (GMD) added value to its mass-spectral and retention index (MSRI) reference libraries by introducing new tools for MSRI data-based search with chemical sub-structure prediction, so that users can search their own spectra against the collection of reference spectra to identify their peaks and, in the case of unmatched spectra, gain clues about possible chemical structures (Hummel et al, 2010).
