Abstract

Deep learning, a family of machine learning models that use artificial neural networks, has achieved great success at predicting outcomes in nonmedical domains. This study examined whether deep learning recurrent neural network (RNN) models that use raw longitudinal data extracted directly from electronic health records outperform conventional regression models in predicting the risk of developing hepatocellular carcinoma (HCC).

This prognostic study included 48 151 patients with hepatitis C virus (HCV)-related cirrhosis in the national Veterans Health Administration who had at least 3 years of follow-up after the diagnosis of cirrhosis. Patients were identified by having at least 1 positive HCV RNA test between January 1, 2000, and January 1, 2016, and were followed up from the diagnosis of cirrhosis to January 1, 2019, for the development of incident HCC. A total of 3 models predicting HCC during a 3-year period were developed and compared, as follows: (1) logistic regression (LR) with cross-sectional inputs (cross-sectional LR); (2) LR with longitudinal inputs (longitudinal LR); and (3) RNN with longitudinal inputs. Data analysis was conducted from April 2018 to August 2020. The predicted outcome was development of HCC. Model performance was measured by area under the receiver operating characteristic curve, area under the precision-recall curve, and Brier score.

During a mean (SD) follow-up of 11.6 (5.0) years, 10 741 of 48 151 patients (22.3%) developed HCC (annual incidence, 3.1%), and a total of 52 983 samples (51 948 [98.0%] from men) were collected. Patients who developed HCC within 3 years were older than patients who did not (mean [SD] age, 58.2 [6.6] years vs 56.9 [6.9] years). RNN models had superior mean (SD) area under the receiver operating characteristic curve (0.759 [0.009]) and mean (SD) Brier score (0.136 [0.003]) compared with cross-sectional LR (0.689 [0.009] and 0.149 [0.003], respectively) and longitudinal LR (0.682 [0.007] and 0.150 [0.003], respectively) models. Using the RNN model, the samples with the mean (SD) highest 51% (1.5%) of HCC risk, in which 80% of all HCCs occurred, or the mean (SD) highest 66% (1.2%) of HCC risk, in which 90% of all HCCs occurred, could potentially be targeted. Among samples from patients who achieved sustained virologic response, the performance of the RNN models was even better (mean [SD] area under the receiver operating characteristic curve, 0.806 [0.025]; mean [SD] Brier score, 0.117 [0.007]).

In this study, deep learning RNN models outperformed conventional LR models, suggesting that RNN models could be used to identify patients with HCV-related cirrhosis with a high risk of developing HCC for risk-based HCC outreach and surveillance strategies.
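As a rough illustration of the performance measures named above, the following minimal sketch computes a Brier score, an area under the receiver operating characteristic curve, and the share of highest-risk samples needed to capture a given proportion of events. It is not the study's code: the function names (`brier_score`, `auroc`, `fraction_to_capture`) and the ten-sample toy data are hypothetical and chosen only to make the definitions concrete.

```python
from typing import Sequence


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted risk and the 0/1 outcome (lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def auroc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Probability that a random case is ranked above a random non-case; ties count as 0.5."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def fraction_to_capture(y_true: Sequence[int], y_prob: Sequence[float], target: float) -> float:
    """Smallest top-risk fraction of samples that contains at least `target` of all events."""
    ranked = sorted(zip(y_prob, y_true), key=lambda t: t[0], reverse=True)
    total_events = sum(y_true)
    captured = 0
    for i, (_, y) in enumerate(ranked, start=1):
        captured += y
        if captured >= target * total_events:
            return i / len(ranked)
    return 1.0


# Hypothetical toy data: 1 = developed the event within the horizon, 0 = did not.
y_true = [1, 0, 1, 0, 0, 1, 0, 0, 0, 1]
y_prob = [0.9, 0.2, 0.7, 0.4, 0.1, 0.6, 0.8, 0.3, 0.2, 0.5]

print(f"Brier score: {brier_score(y_true, y_prob):.3f}")
print(f"AUROC: {auroc(y_true, y_prob):.3f}")
print(f"Top-risk fraction capturing 75% of events: {fraction_to_capture(y_true, y_prob, 0.75):.2f}")
```

The pairwise AUROC loop is quadratic in the number of samples, which is fine for a toy example; for data on the scale reported here, a rank-based implementation from an established library would be the more practical choice.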

Highlights

  • Patients with chronic hepatitis C virus (HCV) infection have a high risk of developing hepatocellular carcinoma (HCC)

  • The recurrent neural network (RNN) model resulted in significantly higher mean (SD) area under the receiver operating characteristic curve (AUROC) (0.759 [0.009]), a measure of discrimination, than the longitudinal logistic regression (LR) (0.689 [0.009]) or cross-sectional LR (0.682 [0.007]) models without feature selection (P < .001 for both comparisons) (Table 2 and Figure 2A)

  • We demonstrated an application for RNN models that outperformed conventional LR models in the prediction of HCC risk in patients with HCV-related cirrhosis, including those who achieve sustained virologic response (SVR) following antiviral therapy

Introduction

Patients with chronic hepatitis C virus (HCV) infection have a high risk of developing hepatocellular carcinoma (HCC). The risk of HCC increases among patients with HCV infection when they develop advanced fibrosis or cirrhosis. The risk decreases after HCV eradication,[1,2,3,4,5] which is becoming increasingly common. Conventional regression models have recently been developed to estimate the risk of HCC in patients with HCV according to the presence or absence of cirrhosis, response to antiviral treatment, and a small number of routinely available baseline clinical characteristics.[6]
