Abstract

Combining a computational framework for flux balance analysis with machine learning improves the accuracy of predicting metabolic activity across conditions, while enabling mechanistic interpretation. This protocol presents a guide to condition-specific metabolic modeling that integrates regularized flux balance analysis with machine learning approaches to extract key features from transcriptomic and fluxomic data. We demonstrate the protocol as applied to Synechococcus sp. PCC 7002; we also outline how it can be adapted to any species or community with available multi-omic data. For complete details on the use and execution of this protocol, please refer to Vijayakumar et al. (2020).

Highlights

  • Combining a computational framework for flux balance analysis with machine learning improves the accuracy of predicting metabolic activity across conditions, while enabling mechanistic interpretation

  • This protocol presents a guide to condition-specific metabolic modeling that integrates regularized flux balance analysis with machine learning approaches to extract key features from transcriptomic and fluxomic data

Summary

11. Import the datasets into MATLAB:

% ... '.1') since transcripts are indicated with '.1' in the model but these are not present in the dataset
expression = '[.]\d';
replace = '';
genes_truncated = regexprep(genes,expression,replace);
% Set gene expression to the set of transcript fold changes in the selected growth condition
for i = 1:numel(genes)
    position = find(strcmp(genes_truncated{i},genes_in_dataset));
    if ~isempty(position)
        pos_genes_in_dataset(i) = position;
        x(i) = expr_profile(pos_genes_in_dataset(i));
    end
end
% Specify the number of variables
V = numel(genes);
% Calculate flux rates for the dark oxic condition
[v1_do, f_out_do] = evaluate_objective_minNorm(x,M,V,fbamodel,genes,reaction_expression,pos_genes_in_react_expr,ixs_genes_sorted_by_length);

10. In our case study, reactions involved in succinate dehydrogenation (SUCD1Itlm/SUCD1Icpm), efflux (SUCCt2b) or exchange (EX_succ_E) were positively correlated with growth for all three objective pairs. They were also among the highest positive correlations when analyzing the concatenated dataset of gene transcripts and Biomass - ATP maintenance flux data (Vijayakumar et al., 2020). These reactions are encoded by A1094 and A2569, which had relatively low gene expression and variability across growth conditions (ranging from 0.33 to 3.74 and from 0.14 to 3.66, respectively).
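Step 10 reports reactions whose flux is positively correlated with growth. As a rough illustration of how such a ranking could be computed, the sketch below assumes a Pearson correlation between each reaction's flux across growth conditions and the growth rate. The variables flux, growth and rxn_names, and all of the values in them, are hypothetical toy data and are not taken from the protocol or its dataset.

```matlab
% Hypothetical toy data: rows are growth conditions, columns are reactions
flux = [0.1 2.0 0.9;
        0.2 1.5 0.8;
        0.3 1.0 0.9;
        0.4 0.5 0.8];
growth = [1; 2; 3; 4];                      % hypothetical growth rate per condition
rxn_names = {'rxn_A', 'rxn_B', 'rxn_C'};    % placeholder reaction names

% Pearson correlation of each reaction's flux with growth (assumed metric)
n_rxns = size(flux, 2);
r = zeros(n_rxns, 1);
for j = 1:n_rxns
    c = corrcoef(flux(:, j), growth);       % 2-by-2 correlation matrix
    r(j) = c(1, 2);                         % off-diagonal entry is the coefficient
end

% Rank reactions from most positively to most negatively correlated
[~, order] = sort(r, 'descend');
for j = order'
    fprintf('%s: r = %.3f\n', rxn_names{j}, r(j));
end
```

In this toy example rxn_A rises linearly with growth and is ranked first, while rxn_B falls linearly and is ranked last. With real data, flux would hold the condition-specific flux rates returned by the model and growth the corresponding biomass flux.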

26. Merge all five arrays into a single list of indices for all subsystems:
33. Select only the columns required:
35. We begin by loading the required variables into MATLAB:
42. The kmeans function is used to perform clustering using the following command:
55. Plot a bar chart using the mean values:
58. Plot the number of reactions in each bin and subsystem using a heatmap:
LIMITATIONS
