Promise and Frustration: Machine Learning in Cardiology

Brandon K. Fornwalt, MD, PhD, and John M. Pfeifer, MD, MPH

Brandon K. Fornwalt: Department of Translational Data Science and Informatics, Geisinger, Danville, PA; Department of Radiology and the Heart Institute, Geisinger, Danville, PA. John M. Pfeifer: Department of Translational Data Science and Informatics, Geisinger, Danville, PA; Heart and Vascular Center, Evangelical Hospital, Lewisburg, PA.

Correspondence: Brandon K. Fornwalt, MD, PhD, 100 N Academy Ave, Danville, PA 17822-4400. Email: [email protected]

Editorial. Originally published 15 Jun 2021. https://doi.org/10.1161/CIRCIMAGING.121.012838. Circulation: Cardiovascular Imaging. 2021;14:e012838. This article is a commentary on "Deep Learning–Based Automated Echocardiographic Quantification of Left Ventricular Ejection Fraction: A Point-of-Care Solution."

See Article by Asch et al.

We are physician-researchers who believe in the promise of computer assistance (especially machine learning) to improve medicine for both patients and providers. However, as believers, we live within a dichotomy of both promise and frustration.
On the one hand, we read a dizzying amount of material promising that machine learning will accomplish the seemingly impossible task of transforming clinical practice by both improving patient outcomes and reducing health care costs. We buy into this promise, having seen first-hand the magic of deep learning, for example, to identify clinically unrecognized evidence of disease (strongly linked to mortality) in seemingly normal 12-lead ECG traces.1 Despite this promise, our clinical days are filled with increasing frustration. Not only are our smartphones far smarter than the systems we use to acquire and interpret clinical data, but our colleagues keep asking for all the artificial intelligence solutions they read about in countless review articles, tweets, and opinion pieces; we have little to offer them but "Don't worry, the revolution is coming, we promise!"

Along these lines, it is refreshing to read the article by Asch et al.2 This tremendous cross-institutional team of physicians and researchers has taken a significant step toward delivering on the promise of machine learning to assist clinicians and patients. The authors showed not only that a machine can automatically calculate the left ventricular ejection fraction (LVEF) from echocardiographic images with accuracy comparable to that of practicing cardiologists but also that it can help largely untrained clinical staff acquire the best possible images. This concept of leveraging machines to augment rather than replace human clinicians should offer the easiest pathway to clinical adoption and ultimately yield positive patient impact for this exciting technology.

So what is the next step? Why are we not seeing more studies like this one, and why are the echocardiography probes in our hospitals not already smarter as a result?
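As background, the LVEF that the model in Asch et al2 estimates from images is, at its core, a simple ratio of ventricular volumes. The sketch below shows that definition; the function name and example volumes are illustrative and not drawn from the study:

```python
def ejection_fraction(edv_ml: float, esv_ml: float) -> float:
    """Left ventricular ejection fraction (%) computed from the
    end-diastolic volume (EDV) and end-systolic volume (ESV) in mL.
    This is the quantity the deep-learning model estimates directly
    from echocardiographic images."""
    if edv_ml <= 0 or esv_ml < 0 or esv_ml > edv_ml:
        raise ValueError("volumes must satisfy 0 <= ESV <= EDV and EDV > 0")
    return 100.0 * (edv_ml - esv_ml) / edv_ml

# Illustrative values: EDV 120 mL, ESV 50 mL gives an LVEF of about 58%,
# which falls within the conventionally normal range.
lvef = ejection_fraction(120.0, 50.0)
```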
There are several reasons for the frustratingly slow adoption of machine learning in medicine that we would like to reflect on within the context of this important study by Asch et al.2

The level of evidence required to adopt new clinical approaches that leverage machine learning is vague. We see this as an operating-point decision: where do we want to live, as a society, on the machine learning technology development receiver operating characteristic (ROC) curve (Figure)? Do we want to be highly sensitive and capture all potential health care machine learning technologies as quickly as possible, or do we favor a more cautious approach? In the first case, we accept the risk of false positives that cost resources but do not benefit patients; in the latter, we prevent impactful machine learning technologies from ever reaching patients (ie, more false negatives). In the context of the current work by Asch et al,2 if we want to be on the highly sensitive portion of the ROC curve, it is time to start using this technology yesterday. However, if we are risk-averse and want to minimize false positives, the next step should be a randomized controlled trial of patient outcomes in the emergency room setting, in which patients are randomized to care with versus without the new technology. Although this second option may sound a bit far-fetched, is that what we need to avoid overselling a new clinical product or treatment?

It is also unclear who will pay for these technologies once they have demonstrated success (still to be defined) in the development phase. In our current world of fee-for-service medicine, the Current Procedural Terminology (CPT) code reigns supreme, but it takes many years and tens of millions of dollars to establish a CPT code for a new technology.
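The operating-point framing above can be made concrete with a toy example: on a fixed set of scored candidates, a permissive adoption threshold captures every true winner at the cost of more false positives, while a conservative threshold does the reverse. All scores and labels here are invented purely for illustration:

```python
# Toy illustration of choosing an operating point on an ROC curve.
# scores: hypothetical model confidence that a technology will benefit
# patients; labels: 1 = technology that truly proves beneficial.
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.20, 0.10, 0.05]
labels = [1,    1,    1,    0,    1,    0,    0,    1,    0,    0]

def rates_at_threshold(scores, labels, thr):
    """True-positive and false-positive rates when adopting every
    candidate scoring at or above thr."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 0)
    pos = sum(labels)
    neg = len(labels) - pos
    return tp / pos, fp / neg

# A permissive ("highly sensitive") threshold adopts all true winners but
# also several duds; a conservative threshold misses some winners while
# adopting no duds.
tpr_lo, fpr_lo = rates_at_threshold(scores, labels, 0.15)  # permissive
tpr_hi, fpr_hi = rates_at_threshold(scores, labels, 0.75)  # conservative
```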
To overcome this challenge, the Centers for Medicare and Medicaid Services proposed the Medicare Coverage of Innovative Technology rule (docket number CMS-3372-P), which provides a pathway for immediate Medicare reimbursement of innovative new medical devices designated as breakthrough devices by the Food and Drug Administration.3 The proposal allows companies to be reimbursed for their technology for 4 years, during which large clinical studies can be conducted to establish true clinical impact. This pathway would facilitate operating at the highly sensitive point on the machine learning technology development receiver operating characteristic curve (Figure). However, there is some uncertainty about how this will unfold, as the most recent relevant notification has delayed the rollout of the Medicare Coverage of Innovative Technology program.4 Alternative payment models that focus on value-based care may provide another route for reimbursement of these new technologies if there is evidence that they can improve care without increasing costs.

The black box nature of machine learning models is, according to some authors, inhibiting clinical adoption.5,6 Although we agree that it is generally good practice to understand as much as possible about how we treat and diagnose patients, health care is filled with black boxes that we accept every day because real-world evidence shows that using them improves patient care. For example, the exact mechanisms that underlie the benefit of most therapies for heart failure with reduced ejection fraction remain incompletely understood, including beta-blockers, ACE (angiotensin-converting enzyme) inhibitors, mineralocorticoid antagonists, and now SGLT2 (sodium-glucose co-transporter-2) inhibitors. If we had demanded full understanding of those mechanisms before clinical implementation, treatment of heart failure would remain where it was 40 years ago, and decades of patient benefit might have gone unrealized.
Perhaps we should focus more on patient outcomes than on the black box: if a technology or treatment demonstrates improved patient outcomes and a robust safety profile in well-designed clinical studies (eg, with randomization and clinical outcome end points that matter to patients), should it matter that the technology is somewhat of a black box?

The poor interoperability of our health care delivery infrastructure is a massive hurdle that hinders development and adoption of machine learning technologies. Our health care infrastructure is broken. Systems manufactured and managed by different companies, with minimal standards for data and data transfer, make health care a very difficult landscape for new technology. In the words of Dr Paul Chang, who spoke on this topic at the NVIDIA GTC conference in 2019,7 we are building fancy cars (machine learning models), but we do not have the gas and roads to drive them (our broken infrastructure). This is why Caption Health, the company responsible for resourcing the development of the work in Asch et al's article,2 not only spent a great deal of time and money developing algorithms but also had to build its own ultrasound machine to deploy them. A company trying to improve patient outcomes with machine learning was forced to morph into a medical hardware manufacturer to adapt to our broken health care infrastructure. This does not make sense and seems like wasted resources.
In an ideal world, all medical devices (such as echocardiography ultrasound systems) would work within well-defined, highly interoperable standards, such as Digital Imaging and Communications in Medicine (DICOM) and Fast Healthcare Interoperability Resources (FHIR), to speed the development and deployment of new technologies.

Finally, we as clinicians and researchers have done a poor job defining the unmet clinical needs for data scientists to solve. This reminds us of a recent book by 2 former Amazon executives, Colin Bryar and Bill Carr, entitled Working Backwards.8 In it, they describe Amazon's process for vetting new projects, called PR FAQ, which stands for Press Release (PR) and Frequently Asked Questions (FAQ). The concept is to work backwards by writing the PR for any new product before beginning its development. The FAQ section, also written before development begins, answers common questions and concerns that may arise about the product at the time of the PR. This process allows teams and companies to thoroughly flesh out ideas before spending millions of dollars and countless person-hours developing a new product. It makes us wonder, in the context of the current work by Asch et al,2 whether this team set out to solve the unmet clinical need of untrained nurses acquiring a point-of-care echocardiogram to automatically determine whether patients in the emergency room have a normal LVEF. The FAQ section for such a product, had it been written before development, would have raised many potential hurdles to overcome: (1) Why is it better to obtain a measure of LVEF via point-of-care ultrasound in the emergency department rather than waiting for dedicated cardiac sonographers and an expert cardiologist assessment? (2) Is it enough to know the LVEF alone, or do we also need an assessment of the valves, pericardium, etc? (3) Can this technology replace the current standard (dedicated cardiac sonographers and expert cardiologist assessment), or does it simply commit a patient to undergoing 2 echocardiograms, the first in the emergency department and a second later in the inpatient or outpatient setting? (4) Is there truly a market that will pay for this in the United States, or is it meant for countries with substantially fewer health care resources?

Figure. The machine learning (ML) technology development receiver operating characteristic (ROC) curve. This curve is meant to portray the upside and downside of early vs late clinical implementation of ML models in medicine by examining different potential operating points. Note that we are assuming some ability of our decisions as a society (eg, the Medicare Coverage of Innovative Technology rule as discussed) to prioritize clinical implementation of ML models with positive clinical impact; otherwise, the curve will sit at the random-chance location with an area under the ROC curve of 0.5. Orange X denotes the highly sensitive operating point; green X denotes the highly specific operating point.

Frustratingly, despite the promise of this incredible work and the number of hurdles this team had to overcome (reminder: they built their very own ultrasound machines to deploy their machine learning technology), we are left with many clinical questions that will take far more resources and time to address. Despite this, Caption Health and Asch et al2 should be commended for taking a shot on goal: they are looking to the bright future of health care, trying to reimagine a world in which computers truly help clinicians improve patient outcomes. We all know from our high school sports coaches that "you miss 100% of the shots you don't take." So, thank you to this incredible team for taking a shot on goal.
Let us also band together as believers in a better future for health care and start writing our PR FAQs for the next set of machine learning technologies that we want Caption Health and other innovative companies to develop. Maybe this approach will help us finally start delivering on the promise of machine learning in cardiology while minimizing the frustration we feel every day for our patients, knowing that things are simply not moving fast enough.

Disclosures

Geisinger receives funding from Tempus for ongoing development of predictive modeling technology and commercialization. Tempus and Geisinger have jointly applied for predictive modeling patents. None of the Geisinger employees have ownership interest in any of the intellectual property resulting from the partnership.

Footnotes

The opinions expressed in this article are not necessarily those of the editors or of the American Heart Association.
