Abstract

Knowing the catalytic turnover numbers of enzymes is essential for understanding the growth rate, proteome composition, and physiology of organisms, but experimental data on enzyme turnover numbers is sparse and noisy. Here, we demonstrate that machine learning can successfully predict catalytic turnover numbers in Escherichia coli based on integrated data on enzyme biochemistry, protein structure, and network context. We identify a diverse set of features that are consistently predictive for both in vivo and in vitro enzyme turnover rates, revealing novel protein structural correlates of catalytic turnover. We use our predictions to parameterize two mechanistic genome-scale modelling frameworks for proteome-limited metabolism, leading to significantly higher accuracy in the prediction of quantitative proteome data than previous approaches. The presented machine learning models thus provide a valuable tool for understanding metabolism and the proteome at the genome scale, and elucidate structural, biochemical, and network properties that underlie enzyme kinetics.

Highlights

  • Network properties were extracted from a genome-scale model (GEM) of E. coli K-12 MG1655, iML1515[26]. The average flux across diverse growth conditions was obtained with a Monte Carlo sampling approach and parsimonious FBA[27]

  • The diversity of biochemical reactions renders genome-scale experimental characterization of enzyme kinetics a task of prohibitive complexity

Summary

Introduction

Knowing the catalytic turnover numbers of enzymes is essential for understanding the growth rate, proteome composition, and physiology of organisms, but experimental data on enzyme turnover numbers is sparse and noisy. In practice, in vitro assays of enzyme activity are sensitive to a variety of extraction and assay parameters, leading to noisy estimates and rendering large-scale estimation of kcat in vitro difficult (see Bar-Even et al.[15] for discussion). To address this issue and to provide estimates of keff in vivo, proteomic data across diverse growth conditions was recently combined with in silico flux predictions to calculate kapp,max, the maximal keff across conditions[14]. We combine known correlates of kcat with novel features for enzyme structure, biochemical mechanism, network context, and assay conditions to build ML models of kcat in vitro and kapp,max that can predict these parameters at the genome scale. Application of these ML models to the parameterization of mechanistic GEMs enables improved predictions of proteome allocation.
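The definition of kapp,max as the maximal keff across conditions can be illustrated with a minimal sketch. It assumes, which the text above does not state explicitly, that the per-condition keff of an enzyme is estimated as the predicted flux divided by the measured enzyme abundance. The function name `kapp_max`, the array layout, the units, and the toy numbers are all hypothetical and chosen only for illustration.

```python
import numpy as np

def kapp_max(fluxes, abundances):
    """Maximal apparent turnover per enzyme across conditions (illustrative).

    fluxes:     (n_conditions, n_enzymes) predicted fluxes, mmol gDW^-1 h^-1 (assumed units)
    abundances: (n_conditions, n_enzymes) enzyme abundances, mmol gDW^-1 (assumed units)
    Returns an array of length n_enzymes in s^-1; NaN where no condition is usable.
    """
    fluxes = np.asarray(fluxes, dtype=float)
    abundances = np.asarray(abundances, dtype=float)

    # A condition is usable only if the enzyme was actually detected.
    valid = abundances > 0

    # Assumed estimate: keff = |flux| / abundance; unusable entries become -inf
    # so that they can never win the maximum.
    safe_abundances = np.where(valid, abundances, 1.0)
    keff = np.where(valid, np.abs(fluxes) / safe_abundances, -np.inf)

    # Maximum over conditions, converted from h^-1 to s^-1.
    best = keff.max(axis=0)
    return np.where(np.isfinite(best), best / 3600.0, np.nan)

# Toy data: two conditions (rows), two enzymes (columns).
example_fluxes = [[3.6, 0.0],
                  [7.2, 1.8]]
example_abundances = [[1e-3, 0.0],
                      [4e-3, 1e-3]]
example_result = kapp_max(example_fluxes, example_abundances)
```

In this toy case the first enzyme reaches its highest apparent turnover in the first condition, while the second enzyme is only detected in the second condition, so that condition alone determines its value.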
