Abstract

Constantly decreasing costs of high-throughput profiling on many molecular levels generate vast amounts of multi-omics data. Studying one biomedical question on two or more omic levels provides deeper insights into underlying molecular processes or disease pathophysiology. In the majority of multi-omics projects, the data are analysed level by level, followed by a combined interpretation of the results. Hence, the full potential of integrated data analysis is not yet leveraged, presumably owing to the complexity of the data and the lack of suitable toolsets. We propose a versatile approach for performing a fully integrated multi-level analysis: the Knowledge guIded Multi-Omics Network inference approach, KiMONo (https://github.com/cellmapslab/kimono). KiMONo performs network inference by using statistical models to combine omics measurements, coupled to a knowledge-guided strategy that exploits prior information from existing biological sources. Within the resulting multimodal network, nodes represent features of all input types (e.g. variants and genes), while edges refer to knowledge-supported and statistically derived associations. In a comprehensive evaluation, we show that our method is robust to noise and exemplify its general applicability to the full spectrum of multi-omics data, demonstrating that KiMONo is a powerful approach towards leveraging the full potential of data sets for detecting biomarker candidates.

Highlights

  • Decreasing costs of high-throughput profiling on many molecular levels generate vast amounts of multi-omics data

  • We present KiMONo, a novel prior-knowledge-guided multi-omics network inference method

  • The algorithm builds a statistical model for each gene, selects the most predictive features and uses these to assemble a multi-level network
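
The per-gene procedure in the last highlight can be illustrated with a minimal sketch. It is not the KiMONo implementation. The text here does not say which statistical model is used, so a plain lasso fitted by coordinate descent stands in for it as an assumption. The function names, the feature names (snp1, cpg1, prot1, geneA), the penalty lam=0.1 and the toy data are all hypothetical.

```python
from statistics import fmean, pstdev


def standardize(values):
    """Center to mean 0 and scale to unit (population) variance."""
    mu, sd = fmean(values), pstdev(values)
    if sd == 0:
        return [0.0] * len(values)  # a constant feature carries no signal
    return [(v - mu) / sd for v in values]


def soft_threshold(rho, lam):
    """Lasso shrinkage operator: move rho toward zero by lam."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def lasso(columns, target, lam, n_iter=100):
    """Coordinate-descent lasso on standardized columns (assumed stand-in model)."""
    n = len(target)
    coefs = [0.0] * len(columns)
    residual = list(target)
    for _ in range(n_iter):
        for j, col in enumerate(columns):
            # Covariance of column j with the partial residual
            # (the column's own current contribution is added back).
            rho = sum(c * (r + coefs[j] * c) for c, r in zip(col, residual)) / n
            new = soft_threshold(rho, lam)
            if new != coefs[j]:
                delta = new - coefs[j]
                residual = [r - delta * c for r, c in zip(residual, col)]
                coefs[j] = new
    return coefs


def infer_network(data, prior, lam=0.1):
    """Fit one sparse model per gene over its prior-linked features.

    `prior` maps each gene to the candidate features that existing
    knowledge links to it; features with a non-zero coefficient become edges.
    """
    edges = []
    for gene, candidates in prior.items():
        y = standardize(data[gene])
        cols = [standardize(data[f]) for f in candidates]
        for feature, coef in zip(candidates, lasso(cols, y, lam)):
            if coef != 0.0:
                edges.append((gene, feature, round(coef, 3)))
    return edges


# Hypothetical toy data: 8 samples, one feature from each of three omic levels.
data = {
    "snp1":  [1, -1, 1, -1, 1, -1, 1, -1],
    "cpg1":  [1, 1, -1, -1, 1, 1, -1, -1],
    "prot1": [1, 1, 1, 1, -1, -1, -1, -1],
}
# geneA depends strongly on snp1, weakly on prot1, and not at all on cpg1.
data["geneA"] = [2 * s + 0.1 * p for s, p in zip(data["snp1"], data["prot1"])]
prior = {"geneA": ["snp1", "cpg1", "prot1"]}

network = infer_network(data, prior)
print(network)  # [('geneA', 'snp1', 0.899)]
```

In this toy case the weak prot1 effect falls below the penalty and is dropped, so only the geneA–snp1 edge survives. Restricting each model to prior-linked candidates keeps the number of predictors per gene small, which matters when samples are scarce.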


Summary

Introduction

Decreasing costs of high-throughput profiling on many molecular levels generate vast amounts of multi-omics data. More sophisticated latent factor-based models have been introduced that can analyse multiple omic levels simultaneously[4,5]. These methods infer lower-dimensional representations (latent factors) of the original high-dimensional multi-omic data space. Improved interpretability is one of the big advantages of network-based approaches. These identify condition-specific key molecules by inferring and analysing a network representation of the processes[6,7]. To increase specificity, one can use more advanced machine learning approaches instead of correlation to identify associations between nodes[9,10]. These methods are only applicable to high-dimensional multi-omic data with large numbers of samples. MiRlastic uses prior knowledge to increase performance in the analysis of high-dimensional, low-sample-size data.
