Abstract

Omic data analysis is steadily growing as a driver of basic and applied molecular biology research. Core to the interpretation of complex and heterogeneous biological phenotypes are computational approaches in the fields of statistics and machine learning. In parallel, constraint-based metabolic modeling has established itself as the main tool to investigate large-scale relationships between genotype, phenotype, and environment. The development and application of these methodological frameworks have occurred independently for the most part, whereas the potential of their integration for biological, biomedical, and biotechnological research is less known. Here, we describe how machine learning and constraint-based modeling can be combined, reviewing recent works at the intersection of both domains and discussing the mathematical and practical aspects involved. We overlap systematic classifications from both frameworks, making them accessible to nonexperts. Finally, we delineate potential future scenarios, propose new joint theoretical frameworks, and suggest concrete points of investigation for this joint subfield. A multiview approach merging experimental and knowledge-driven omic data through machine learning methods can incorporate key mechanistic information in an otherwise biologically-agnostic learning process.

Highlights

  • Today, the search for biological mechanisms at molecular scale can leverage an unprecedented amount of information

  • We show that mining and integrating experimental multiomic data with multiomic data generated by genome-scale metabolic models (GSMMs), using machine learning techniques, can unveil unknown mechanisms in a sample-specific manner, identifying relevant targets for biotechnology and biomedicine

  • The use of machine and deep learning in computational and systems biology will keep growing in parallel with the rapid advancement of high-throughput omic technologies

Summary

OPEN ACCESS

Citation: Zampieri G, Vijayakumar S, Yaneske E, Angione C (2019) Machine and deep learning meet genome-scale metabolic modeling. PLoS Comput Biol 15(7): e1007084. https://doi.org/10.1371/journal.pcbi.1007084

Funding: CA received funding from the Biotechnology and Biological Sciences Research Council (BBSRC), grants CBMNet-PoC-D0156 and NPRONET-BIV-015 (BB/L013754/1) (URLs: https://bbsrc.ukri.org/; http://www.cbmnetnibb.net/; https://npronet.com/). GZ and CA were also supported by the "Health and wellbeing" grand challenge at Teesside University (URL: https://www.tees.ac.uk/sections/research/healthwellbeing/index.cfm). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

Introduction
Types of machine learning approaches
Machine learning for multiomic data
Supervised fluxomic analysis
[Figure and table labels under this heading: PCA; hierarchical clustering; decision trees; prediction of growth conditions; reaction essentiality prediction; drug target prediction. Table columns: Study; Data integration approach; Machine learning component]
Unsupervised fluxomic analysis
Supervised multiomic analysis
Unsupervised multiomic analysis
Advantages and limitations of expanding the multiomic array in silico
Emerging applications
Conclusion