Prediction of Human Induced Pluripotent Stem Cell Cardiac Differentiation Outcome by Multifactorial Process Modeling.

Bianca Williams, Ferdous Finklea, Felix Manstein, Wiebke Löbel, Samira Mohammadi, Caroline Halloin, Robert Zweigerdt, Mohammadjafar Hashemi, Katharina Ritzenhoff, Selen Cremaschi, Elizabeth Lipke

doi:10.3389/fbioe.2020.00851

Open Access

Abstract

Human cardiomyocytes (CMs) have potential for use in cell therapy and high-throughput drug screening. Because of the inability to expand adult CMs, their large-scale production from human pluripotent stem cells (hPSCs) has been suggested. Significant improvements have been made in understanding directed differentiation processes of CMs from hPSCs and their suspension culture-based production under chemically defined conditions. However, optimization experiments are costly, time-consuming, and highly variable, leading to challenges in developing reliable and consistent protocols for the generation of large CM numbers at high purity. This study examined the ability of data-driven modeling with machine learning to identify key experimental conditions and predict final CM content using data collected during hPSC cardiac differentiation in advanced stirred tank bioreactors (STBRs). Through feature selection, we identified process conditions, features, and patterns that are the most influential on and predictive of the CM content at the process endpoint, on differentiation day 10 (dd10). Process-related features were extracted by feature engineering from experimental data collected from 58 differentiation experiments. These features included data continuously collected online by the bioreactor system, such as dissolved oxygen concentration and pH patterns, as well as data determined offline, including the cell density, cell aggregate size, and nutrient concentrations. The selected features were used as inputs to construct models to classify the resulting CM content as being “sufficient” or “insufficient” relative to pre-defined thresholds. The models built using random forests and Gaussian process modeling predicted insufficient CM content for a differentiation process with 90% accuracy and precision on dd7 of the protocol and with 85% accuracy and 82% precision at a substantially earlier stage: dd5.
These models provide insight into potential key factors affecting hPSC cardiac differentiation to aid in selecting future experimental conditions and can predict the final CM content at earlier process timepoints, providing cost and time savings. This study suggests that data-driven models and machine learning techniques can be employed using existing data for understanding and improving production of a specific cell type, which is potentially applicable to other lineages and critical for realization of their therapeutic applications.

Highlights

The heart is one of the least regenerative organs in the body; when disease or damage occurs to the myocardium, native cardiac muscle cells, cardiomyocytes (CMs), are replaced with fibrotic scar tissue.
Classification models were constructed for predicting the outcome of the bioreactor experiments on dd10 using features measured up to dd7 and up to dd5, using each of the machine learning techniques described in Sections “Multivariate Adaptive Regression Splines,” “Random Forests,” and “Gaussian Process Regression.”
Results were obtained using leave-one-out (LOO) cross-validation and are presented both for the bioprocess features selected by the built-in feature selection of each model and for the principal components (PCs) obtained from principal component analysis (PCA).
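
The leave-one-out evaluation mentioned above can be illustrated with a minimal sketch. It is not the authors' pipeline: a simple nearest-centroid classifier stands in for the random forest and Gaussian process models, the two features are synthetic placeholders, and the function names are hypothetical. Only the number of experiments (58) and the two class labels are taken from the text. Each experiment is held out once, the classifier is fit on the remaining experiments, and accuracy and precision are computed with "insufficient" as the positive class.

```python
import random

def nearest_centroid_predict(train_X, train_y, x):
    """Stand-in classifier (assumption): pick the class whose centroid is closest to x."""
    centroids = {}
    for label in set(train_y):
        rows = [r for r, y in zip(train_X, train_y) if y == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]

    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))

    # sorted() makes tie-breaking deterministic
    return min(sorted(centroids), key=lambda lab: sq_dist(centroids[lab]))

def leave_one_out(X, y, predict):
    """Hold out each sample once, train on the rest, and collect the predictions."""
    preds = []
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        preds.append(predict(train_X, train_y, X[i]))
    return preds

def accuracy_and_precision(y_true, y_pred, positive="insufficient"):
    """Accuracy over all samples; precision for the chosen positive class."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if (tp + fp) else float("nan")
    return correct / len(y_true), precision

# Synthetic placeholder data: 58 "experiments" with two made-up features each.
random.seed(0)
X, y = [], []
for i in range(58):
    label = "insufficient" if i % 2 == 0 else "sufficient"
    center = 0.0 if label == "insufficient" else 3.0
    X.append([random.gauss(center, 1.0), random.gauss(center, 1.0)])
    y.append(label)

preds = leave_one_out(X, y, nearest_centroid_predict)
acc, prec = accuracy_and_precision(y, preds)
print(f"LOO accuracy: {acc:.2f}, precision (insufficient): {prec:.2f}")
```

With only 58 experiments, leave-one-out uses nearly all of the data for training in every fold, which is a common reason to prefer it over a single train/test split on small datasets.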

Summary

Introduction

The heart is one of the least regenerative organs in the body; when disease or damage occurs to the myocardium, native cardiac muscle cells, cardiomyocytes (CMs), are replaced with fibrotic scar tissue. Due to the large number of patients who suffer from cardiovascular disease, along with the vast number of cells presumably needed for a therapeutic effect, scalable production of CMs in a consistent and reproducible manner is critical for the clinical translation and success of these treatments. The resulting variability in endpoint cell purity, or CM content, together with time constraints and the CMs' phenotype and maturity, impedes commercial production and progress to clinical translation. This precludes the use of hPSC-CMs for other mass applications, including high-throughput screenings for drug development and safety pharmacology (Fonoudi et al., 2015; Sun and Nunes, 2017; Machiraju and Greenway, 2019), and slows progress in cardiac tissue engineering (Kensah et al., 2013).

Objectives

Methods

Results

Conclusion