Abstract

Unleashing the Power of Machine Learning to Predict Myocardial Recovery After Left Ventricular Assist Device: A Call for the Inclusion of Unstructured Data Sources in Heart Failure Registries

Ramsey M. Wehbe, MD, MSAI
Division of Cardiology, Department of Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL.
https://orcid.org/0000-0003-0599-7957

Editorial. Circulation: Heart Failure. 2022;15:e009278 (Vol. 15, No. 1). Originally published 24 Dec 2021. https://doi.org/10.1161/CIRCHEARTFAILURE.121.009278

Correspondence to: Ramsey M. Wehbe, MD, MSAI, Division of Cardiology, Department of Medicine, Northwestern University Feinberg School of Medicine, 676 N St. Clair St, Ste 600, Chicago, IL 60611. Email: [email protected]

This article is a commentary on the following: Machine Learning-Based Prediction of Myocardial Recovery in Patients With Left Ventricular Assist Device Support

“There are only patterns, patterns on top of patterns, patterns that affect other patterns. Patterns hidden by patterns. Patterns within patterns[…]What we call chaos is just patterns we haven’t recognized.
What we call random is just patterns we can’t decipher[…]”
Chuck Palahniuk, Survivor [1]

See Article by Topkara et al

Artificial intelligence has recently garnered significant attention in popular media as advances in machine learning (ML), and particularly deep learning (DL), have made possible groundbreaking innovations, such as self-driving cars and voice-controlled virtual assistants. Cardiovascular medicine has not been immune to the enthusiasm surrounding ML, as evidenced by the exponential growth of publications in this space over the past 10 years (Figure). ML models have been employed across the spectrum of cardiovascular disease, particularly for patients with heart failure (HF), to automate time-consuming tasks, assist in the diagnosis or detection of disease, deliver insights into new disease phenotypes and pathophysiologic mechanisms, and, perhaps the most elusive task of all, accurately predict patient outcomes.

Figure. Publication timeline showing exponential growth of publications by year for the topic of machine learning in cardiovascular medicine. Data were exported from pubmed.ncbi.nlm.nih.gov.

In this edition of Circulation: Heart Failure, Topkara et al [2] take on the important task of predicting myocardial recovery after durable left ventricular assist device (LVAD) implantation. The primary end point results of the RESTAGE-HF clinical trial (Remission from Stage D Heart Failure) were recently published, showing durable recovery in a large proportion of carefully selected LVAD patients on a standardized treatment protocol [3]. Appropriate selection of patients for specialized care plans to maximize the probability of myocardial recovery is, therefore, key to allocating resources efficiently.
Although statistical risk prediction models for myocardial recovery after LVAD exist [4], this study is novel in the application of ML methodology to the problem.

In a population of over 20 000 patients from the Society of Thoracic Surgeons INTERMACS (Interagency Registry for Mechanically Assisted Circulatory Support) database, the authors first used least absolute shrinkage and selection operator (LASSO) logistic regression for feature selection among 98 possible risk factors derived from discrete variables included in the database. Next, they used the resulting 28 variables (or features in ML terminology) to evaluate the discriminative ability of 5 different ML models (Bayesian logistic regression, support vector machine, gradient boosted decision tree, neural network, and random forest) in predicting myocardial recovery, which was defined as LVAD explant for myocardial recovery. The authors reported that these ML models (area under the receiver operating characteristic curve [AUC] 0.813–0.824) all outperformed established statistical regression-based recovery prediction models derived in earlier versions of the INTERMACS database (AUC 0.744–0.748) and identified a set of previously underappreciated features, including a history of noncompliance, tobacco/alcohol use, and limited social support at the time of LVAD implant, that were, seemingly paradoxically, associated with myocardial recovery. This is an important step towards early identification of patients after LVAD implantation who might benefit from intensive guideline-directed medical therapies for HF to maximize chances of myocardial recovery.
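To make the metrics in this discussion concrete, the following minimal Python sketch (an illustration added here, not the authors' analysis) computes an AUC as the probability that a randomly chosen patient who recovered receives a higher risk score than a randomly chosen patient who did not, and uses Bayes' rule to show how positive predictive value depends on how common the outcome is. The risk scores, the sensitivity and specificity values, and the prevalence values are all hypothetical and are not drawn from INTERMACS.

```python
from itertools import product


def auc(scores_pos, scores_neg):
    """Probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    wins = 0.0
    for p, n in product(scores_pos, scores_neg):
        if p > n:
            wins += 1.0
        elif p == n:
            wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value from Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Toy risk scores (hypothetical, not registry data).
recovered = [0.9, 0.7, 0.6, 0.4]
not_recovered = [0.8, 0.5, 0.3, 0.2, 0.1]
print(f"AUC = {auc(recovered, not_recovered):.2f}")

# One hypothetical classifier (sensitivity 0.70, specificity 0.90)
# evaluated at two hypothetical outcome prevalences.
for prevalence in (0.20, 0.02):
    print(f"prevalence {prevalence:.0%}: PPV = {ppv(0.70, 0.90, prevalence):.2f}")
```

In this toy setting the same classifier yields a much lower positive predictive value when the outcome is rare, even though its ability to rank patients is unchanged.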
It should be noted that only 1 in 5 patients predicted to recover by the ML model underwent device explant for myocardial recovery at 4 years postimplant, driven, in part, by the overall low incidence of myocardial recovery in LVAD patients.

Interestingly, when the authors derived a novel risk score using a basic logistic regression model on the same contemporary INTERMACS data set used to train the ML models, the best performing ML model only marginally outperformed this simpler statistical model (AUC 0.824 versus 0.796, P=0.046). This finding is consistent with prior risk prediction studies in HF cohorts comparing ML methods to traditional statistical risk modeling, which have demonstrated minimal incremental improvement in prediction metrics with ML [5]. However, this is not necessarily a shortcoming of ML methodology, but rather of its implementation. Namely, the simple application of increasingly complex mathematical functions to the same discrete data elements yields rapidly diminishing returns.

As the authors point out, one limitation of the current study is that more complex, unstructured source data (eg, raw imaging data, free text from clinical notes, ECG and hemodynamic waveforms, genome sequencing, and time-series data) was not available, given that the INTERMACS database consists almost exclusively of structured, tabular data. However, one could argue that the primary advantage of modern ML models, chiefly DL architectures, is the ability to effectively model complex, unstructured data sources in ways that were not previously possible using traditional statistical or mathematical modeling. As opposed to the simple single hidden-layer neural networks utilized in the current study, deep neural networks are capable of modeling complex data inputs via a series of learned features and nonlinear transformations without the need for manual feature preprocessing, instead extracting important features for a specific task in an automated fashion.
This characteristic of DL models has made them particularly adept at computer vision and natural language processing tasks, dramatically outperforming the previous state of the art in these domains. Unsurprisingly, prior investigators have found that including imaging [6], clinical notes [7], ECG waveforms [8], wearable sensor data [9], or longitudinal data [10] in a DL framework improves the performance of risk prediction modeling compared to the use of tabular data alone. DL models that efficiently handle multiple types of these unstructured data sources at once as inputs into a multimodal framework have also yielded encouraging results [11].

Clearly, there is promise in analyzing rich unstructured data sources towards unlocking hidden patterns in this data at scale and reducing some of the inherent stochasticity involved in HF outcomes prediction. However, there are a few considerations related to this approach that deserve mention. First, models used to predict outcomes in patients with HF must be explainable or interpretable to be clinically useful. Although the paradigm has traditionally been that more complex data sets and more complex modeling lead to decreased transparency into how a model arrived at a certain prediction (the black box of ML), significant progress has been made, and there is active research into lifting the lid on these models using methods such as heatmaps for visual explanations of model predictions [12]. Second, more complex, unstructured data sources typically require larger amounts of data for training to prevent overfitting, a phenomenon that limits a model’s generalizability to external data sets. Indeed, it was the curation of large publicly available unstructured data sets such as ImageNet [13], a collection of over 14 million labeled images from natural scenes, that paved the way for the broad success of modern DL-based computer vision systems.
While we are starting to see the emergence of large, anonymized, publicly available data sets of cardiovascular imaging [14] and free-text clinical reports [15], the applicability of these data sets to the task of predicting outcomes in HF is limited due to the lack of robust clinical and outcomes data of similar quality to that included in clinical registries.

Unfortunately, existing large HF registries do not routinely provide such unstructured data. While this data might be available through an individual participating site’s core lab, it is exceedingly difficult to obtain raw imaging data, for example, for an entire registry cohort. One notable exception is the recent launch of the National Heart, Lung, and Blood Institute’s HeartShare program, a goal of which is to explicitly aggregate unstructured data sources including phenotypic data, images, and omics from patients with HF with preserved ejection fraction for large-scale analysis to elucidate mechanisms of disease and identify new targets for therapeutic intervention. This should serve as an example of the standard for modern HF registries: only when our data sets begin to match the capabilities of modern ML algorithms will we unleash the true potential of these technologies for HF outcomes prediction.

Article Information

Disclosures

Dr Wehbe has received research support from Pfizer and the American Society of Nuclear Cardiology that is outside the scope of this work and not relevant to this editorial.

Footnotes

The opinions expressed in this article are not necessarily those of the editors or of the American Heart Association.
