Computational models of cell metabolism seek to explain cell behavior under different conditions or following genetic alterations, help optimize in vitro cell growth environments, or predict cellular behavior in vivo and in vitro. At the extremes, mechanistic models range from highly detailed descriptions of a small number of metabolic reactions to approximate representations of an entire metabolic network. To date, all mechanistic models have required details of individual metabolic reactions, either kinetic parameters or metabolic fluxes, as well as information about extracellular and intracellular metabolite concentrations. Despite extensive efforts and the increasing availability of high-quality data, the required in vivo data are not available for the majority of known metabolic reactions; thus, mechanistic models are based primarily on ex vivo kinetic measurements and limited flux information. Machine learning approaches provide an alternative by deriving functional dependencies from existing data. The increasing availability of metabolomic and lipidomic data, with growing feature coverage and sample set sizes, is expected to provide the data needed to derive machine learning models of cell metabolic processes. Moreover, machine learning analysis of longitudinal data can lead to predictive models of cell behavior over time. In contrast, machine learning models trained on steady-state data can provide descriptive models for comparing metabolic states in different environments or disease conditions. Additionally, including metabolic network knowledge in these analyses can help in developing models when data are limited. This chapter will explore the application of machine learning to the modeling of cell metabolism.
We first provide a theoretical explanation of several machine learning and hybrid mechanistic and machine learning methods currently being explored to model metabolism. Next, we introduce several avenues for improving these models with machine learning. Finally, we provide protocols for specific examples of the use of machine learning in the development of predictive cell metabolism models from metabolomic data. We describe data preprocessing, approaches for training both descriptive and predictive machine learning models, and the use of these models in synthetic and systems biology. The detailed protocols include a list of software tools and libraries used for these applications, step-by-step modeling procedures, troubleshooting guidance, and an overview of the existing limitations of these approaches.