Abstract

The abundance of high-dimensional measurements in the form of gene expression and mass spectrometry data calls for models that elucidate the underlying biological system. For widely studied organisms like yeast, it is possible to incorporate prior knowledge from a variety of databases, an approach used in several recent studies. However, if such information is not available for a particular organism, these methods fall short. In this paper, we propose a statistical method for identifying genes that control the production of metabolites. The method applies to a dataset consisting of Liquid Chromatography-Mass Spectrometry (LC-MS) and gene expression (DNA microarray) measurements taken from the same samples. Because both LC-MS and DNA microarray data are high-dimensional, dimension reduction and variable selection are key elements of the analysis. Our proposed approach starts by identifying the basis functions (“building blocks”) that constitute the output from a mass spectrometry experiment. Subsequently, the weights of these basis functions are related to the observations from the corresponding gene expression data in order to identify which genes are associated with specific patterns seen in the metabolite data. The modeling framework is flexible and computationally fast, and it can accommodate treatment effects and other variables related to the experimental design. We demonstrate that, within the proposed framework, genes regulating the production of specific metabolites can be identified correctly unless the variation in the noise is more than twice that of the signal.
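The two-step idea described in the abstract (extract basis functions from the spectra, then relate their weights to gene expression) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not name the estimators, so a truncated SVD stands in for the basis-function step and a marginal Pearson correlation stands in for variable selection. The function names (`simulate`, `basis_weights`, `rank_genes`), the Gaussian peak shape, and the definition of signal-to-noise ratio as a variance ratio are likewise hypothetical.

```python
import numpy as np


def simulate(n_samples=40, n_mz=200, n_genes=30, snr=4.0, seed=0):
    """Toy data (assumed setup): one spectral peak whose height follows gene 0."""
    rng = np.random.default_rng(seed)
    genes = rng.normal(size=(n_samples, n_genes))
    mz = np.arange(n_mz)
    # Gaussian-shaped peak centred at index 80 (an illustrative choice).
    peak = np.exp(-0.5 * ((mz - 80) / 3.0) ** 2)
    signal = np.outer(genes[:, 0], peak)
    # SNR is taken here as var(signal) / var(noise); this definition is an assumption.
    noise_sd = np.sqrt(signal.var() / snr)
    spectra = signal + rng.normal(scale=noise_sd, size=signal.shape)
    return spectra, genes


def basis_weights(spectra, n_basis):
    """Step 1: dimension reduction. SVD is a stand-in for the basis-function step."""
    centered = spectra - spectra.mean(axis=0)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    weights = u[:, :n_basis] * s[:n_basis]  # per-sample weight of each basis function
    basis = vt[:n_basis]                    # basis functions over the m/z axis
    return weights, basis


def rank_genes(weights, genes):
    """Step 2: absolute Pearson correlation between basis weights and each gene."""
    w = (weights - weights.mean(axis=0)) / weights.std(axis=0)
    g = (genes - genes.mean(axis=0)) / genes.std(axis=0)
    return np.abs(w.T @ g / len(w))  # shape: (n_basis, n_genes)


spectra, genes = simulate()
weights, basis = basis_weights(spectra, n_basis=1)
assoc = rank_genes(weights, genes)
top_gene = int(np.argmax(assoc[0]))
print("gene most associated with the first basis function:", top_gene)
```

Lowering the `snr` argument in `simulate` gives a rough feel for how recovery of the driving gene degrades as noise grows. The thresholds reported in the abstract come from the authors' own simulations and are not reproduced by this toy example.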

Highlights

  • Metabolites are the products of cell metabolism and their functions are highly diverse

  • In one scenario, we increased the number of genes; the other scenario investigated how the method performs when no associations between genes and peaks in the spectra are present

  • Each line in the figure represents a setup with a given number of peaks, and the percentage of incorrectly classified genes is shown for various signal-to-noise ratios (SNR) ranging from 52 to 0.1

Summary

Introduction

Metabolites are the products of cell metabolism and their functions are highly diverse. We wish to combine data from transcriptomics and metabolomics into one experimental setup in order to generate hypotheses about the regulatory processes between different molecular levels. While the biological processes between different levels of "omics" are highly complex, a combined analysis of metabolite and gene expression data will help discover and elucidate the underlying regulatory networks and identify genes that influence the metabolome because they are involved, directly or indirectly, in the metabolism. Metabolomic studies provide information on the metabolites found within a biological sample, for example from mass spectrometry data, and can be used to distinguish the amounts and types of metabolites in different samples or under different conditions.

