Area Under The Receiver Operating Characteristic Research Articles

Background: Acute myeloid leukemia (AML) is a life-threatening disease, and hematologists risk-stratify each case to determine which patients need allogeneic stem cell transplantation. However, standard risk stratification using the European LeukemiaNet (ELN) criteria focuses on baseline mutations and chromosomal aberrations, and the risk estimate is not updated during a patient's course. In other blood cancers, recalculating the risk with treatment response data can help guide the need for more intensive therapy (Kurtz et al., Cell, 2019). Furthermore, deep learning graph neural networks (GNNs) applied to electronic health record (EHR) data have strong predictive power in a hematology context (Fouladvand et al., J Biomed Inform, 2023). Thus, we evaluated the power of a GNN to predict survival in AML using longitudinal EHR data, specifically labs and histological features that are not included in the ELN criteria but may capture the treatment response.

Methods: Patients who were seen at the Stanford Cancer Institute, had EHR data available within six months of diagnosis, and were diagnosed with AML between June 1998 and January 2021 were included in this retrospective analysis. The GNN was trained to predict survival at two years from diagnosis using the first six months of clinical data. Patients were excluded if they were lost to follow-up before two years or died before six months. Data were collected from structured databases associated with Stanford's EHR, except that diagnosis dates were taken from Stanford's Cancer Registry, and survival data were supplemented with other databases, including the Social Security Death Index. Dysplasia, bone marrow cellularity, and bone marrow blast percentages from pathology reports (“pathology report data”) were extracted using text processing algorithms and weakly supervised machine learning (Ratner et al., ArXiv, 2017). To represent time series information, we framed each patient's timeline as a network (or “graph”) of events. The primary GNN model was a heterogeneous graph transformer classifier with two node types: complete blood count (CBC) data and pathology report data (Hu et al., ArXiv, 2020). Data from the same week were assumed to be from the same timeframe and were connected with bidirectional edges. Data separated by longer time periods were connected with unidirectional edges of a separate edge type. The independent test dataset consisted of patients whose ELN 2022 classification was available; the remaining data were divided into train/validation splits of 0.9/0.1 to train the model.

Results: Of the 2,535 patients with survival data, 1,029 met inclusion criteria. Table 1 summarizes the data available in the EHR for each variable, and nearly all patients had CBC and pathology report data. The area under the receiver operating characteristic curve (AUROC) of the ELN 2022 criteria for predicting survival in the test dataset was 0.79. The AUROC of the GNN model was comparable at 0.76, despite not using any variables from the ELN criteria, and the model effectively stratified patients' disease into high and low risk in the independent test dataset (hazard ratio [HR] 3.0, log-rank p = 0.0009). Interestingly, despite the model not having access to mutation or cytogenetic data, the high-risk cases were enriched in known high-risk mutations, like TP53 and RUNX1, and in high-risk chromosomal aberrations, like 5q deletion (Table 1). Although the model predictions correlated with the ELN criteria in some ways, they also stratified the ELN intermediate-risk AML cases into high and low risk (HR 6.1 for model-predicted high risk among ELN intermediate cases, p = 0.07).

Conclusions: Risk stratification using artificial intelligence and longitudinal data from the EHR performed comparably to the ELN 2022 criteria and has the potential to further stratify the ELN categories. The model performed well despite using only histological features and lab values, which are more readily available and more frequently updated than next-generation sequencing results. In the future, this approach may improve further with a larger sample size and additional variables, such as measurable residual disease and treatment information. Given the heterogeneity and increasing complexity of AML classification, leveraging artificial intelligence to assist with classification will be crucial, and these results are a step towards a future where data are automatically extracted from the EHR and used for continuously updated risk stratification.
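
The timeline-as-graph framing described in the Methods can be illustrated with a small sketch. This is not the authors' code: the function name `build_timeline_edges`, the 7-day window used to interpret "the same week", and the choice to link every earlier event to every later one are all assumptions made for illustration. It only shows how two edge types (bidirectional for same-timeframe events, unidirectional for events separated by longer periods) could be derived from dated events.

```python
from datetime import date
from itertools import combinations


def build_timeline_edges(events, window_days=7):
    """Build two edge lists from a patient's dated events.

    events: list of (node_type, event_date) tuples, e.g. ("cbc", date(...)).
    window_days: assumed width of "the same week" (hypothetical parameter).
    Returns (same_time_edges, later_edges) as lists of (source, target)
    index pairs into `events`.
    """
    # Visit events in chronological order so the first index of each pair
    # is never later than the second.
    order = sorted(range(len(events)), key=lambda i: events[i][1])
    same_time_edges, later_edges = [], []
    for earlier, later in combinations(order, 2):
        gap = (events[later][1] - events[earlier][1]).days
        if gap < window_days:
            # Same timeframe: connect in both directions.
            same_time_edges.append((earlier, later))
            same_time_edges.append((later, earlier))
        else:
            # Longer separation: a single edge pointing forward in time,
            # kept in a separate list so it can be a separate edge type.
            later_edges.append((earlier, later))
    return same_time_edges, later_edges


if __name__ == "__main__":
    timeline = [
        ("cbc", date(2020, 1, 1)),
        ("pathology", date(2020, 1, 3)),
        ("cbc", date(2020, 2, 1)),
    ]
    same_time, forward = build_timeline_edges(timeline)
    print("same-timeframe edges:", same_time)
    print("forward-in-time edges:", forward)
```

In a real heterogeneous graph model, the two lists would presumably be further split by source and target node type (CBC or pathology report) before being passed to the network; that step is omitted here.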

Background: Circulating tumor DNA (ctDNA) assessment is effective in diffuse large B-cell lymphoma (DLBCL) monitoring and risk stratification, with prognostic utility throughout first-line (1L) therapy. DLBCL ctDNA assays vary in analytical sensitivity, or limit of detection (LOD), which ranges from parts per thousand to below 1 part per million (1 in 10^6). With differing assays and timepoints assessed in DLBCL studies, the relationship between analytical and clinical sensitivity for outcome prediction remains unclear. We assessed the prognostic ability of ctDNA at various LODs and used modeling strategies to project the efficacy of assays for minimal residual disease (MRD) detection.

Methods: We previously reported a DLBCL dataset assessing plasma ctDNA before, during, and after curative-intent 1L anthracycline-based treatment from multiple prospective studies, including R-CHOP or EP[O]CH with acalabrutinib, lenalidomide, obinutuzumab, and polatuzumab (Roschewski et al., ASH 2022). In that study, ctDNA was evaluated by Phased Variant Enrichment Detection and Sequencing (PhasED-Seq, Foresight Diagnostics), a ctDNA MRD assay with an LOD below 1 in 10^6. Samples from that study and from a prospective DLBCL cohort treated with standard 1L therapy at Samsung Medical Center were considered in this analysis. We assessed the ability of ctDNA detection to predict progression-free survival (PFS) before, during, and after 1L treatment, considering LODs from 1 in 100 (1 in 10^2) to 1 in 10^6. We assessed the desired sensitivity for ctDNA MRD during treatment by generating patient-specific log-linear models assuming exponential decay. We projected the distribution of variant allele frequencies (VAFs) during therapy from cycles 2 to 5 for patients who progressed to determine the minimal acceptable analytical sensitivity.

Results: We included 230 patients contributing 588 ctDNA samples: 201 before therapy, 71 at C2D1, 101 at C3D1, 70 at C4D1, and 145 at end of therapy (EOT). Median follow-up was 17.5 months, and 62 patients (27%) progressed. To evaluate the impact of LOD at treatment milestones, we applied a threshold for ctDNA positivity ranging from 1 in 10^2 to 1 in 10^6. A more sensitive LOD had no effect on the prognostic performance of ctDNA before therapy (Figure A & B). At C2D1, there was no difference in MRD prognostic performance provided the LOD was at least as sensitive as 1 in 10^4. Starting at C3D1, improving the LOD for ctDNA positivity down to 1 in 10^6 showed superior predictive power for PFS at 24 months (PFS24). The power of ctDNA to predict PFS24 improved later in therapy, with areas under the receiver operating characteristic curve (AUROCs) of 0.68, 0.73, 0.77, 0.88, and 0.86 at the pretreatment, C2D1, C3D1, C4D1, and EOT timepoints, respectively (Figure A). To further understand the desired LOD for ctDNA MRD, we developed personalized models of ctDNA VAFs. Exponential models fit the data well in 43/44 cases with ≥ 3 MRD-detectable samples through C4D1, with a median correlation of 0.91, confirming their utility for modeling ctDNA. We generated log-linear models for 106 patients with ≥ 2 MRD-detectable samples and used the log-fold change in VAF per cycle (i.e., the slope) to compare patients by progression. The median log-fold change in VAF per cycle was less steep in patients who progressed, at -1.1 (IQR -1.4 to -0.7), than in those who remained disease-free, at -1.5 (IQR -2.0 to -1.1) (p < 0.0001). Patients with primary refractory DLBCL on EOT imaging had a less robust log-fold change per cycle, with a median of -0.9 (IQR -1.3 to -0.5), than those who relapsed after complete response (CR), with a median of -1.3 (IQR -1.7 to -0.9) (p = 0.0004). We used the distribution of slopes to project VAFs for cycles 2 to 5 for patients who progressed to determine the LODs that provide acceptable clinical sensitivity. We found that the analytical sensitivity required to detect DLBCL extended lower with each cycle, demonstrating the need for an LOD of at least 1 in 10^6 for robust MRD detection at late timepoints during 1L treatment.

Conclusion: When using an ultrasensitive assay, MRD assessment at later timepoints better predicts PFS than at early timepoints. While the technical LOD does not affect disease burden assessment before therapy, using more sensitive assays during and after therapy improves disease detection and outcome prediction. Utilizing the most sensitive ctDNA MRD assays in 1L DLBCL therapy will maximize the efficacy of MRD-driven therapeutic strategies and of MRD as a surrogate endpoint in future trials.
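
The patient-specific log-linear modeling described above can be sketched as follows. This is an illustrative example, not the study's implementation: the function names, the use of base-10 logarithms, and the ordinary least-squares fit are assumptions, and the VAF values in the usage section are invented. Under exponential decay, log10(VAF) is linear in cycle number, so the fitted slope plays the role of the log-fold change in VAF per cycle, and the fitted line can be extrapolated to a later cycle and compared with an assay's LOD.

```python
import math


def fit_log_linear(cycles, vafs):
    """Least-squares fit of log10(VAF) = intercept + slope * cycle.

    cycles: cycle numbers at which ctDNA was detected (at least 2 distinct).
    vafs: matching variant allele frequencies, each > 0.
    Returns (slope, intercept); slope is the log10-fold change per cycle.
    """
    log_vafs = [math.log10(v) for v in vafs]
    n = len(cycles)
    mean_x = sum(cycles) / n
    mean_y = sum(log_vafs) / n
    sxx = sum((x - mean_x) ** 2 for x in cycles)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(cycles, log_vafs))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def project_vaf(slope, intercept, cycle):
    """Extrapolate the fitted line to a given cycle and return the VAF."""
    return 10 ** (intercept + slope * cycle)


def is_detectable(vaf, lod):
    """An assay is assumed to detect ctDNA when the VAF is at or above its LOD."""
    return vaf >= lod


if __name__ == "__main__":
    # Invented example values: VAF falls tenfold per cycle.
    slope, intercept = fit_log_linear([1, 2, 3], [1e-2, 1e-3, 1e-4])
    vaf_at_cycle_5 = project_vaf(slope, intercept, 5)
    print("slope per cycle:", slope)
    print("projected VAF at cycle 5:", vaf_at_cycle_5)
    for lod in (1e-4, 1e-5, 1e-7):
        print("LOD", lod, "detects:", is_detectable(vaf_at_cycle_5, lod))
```

Repeating the projection over a distribution of slopes, rather than a single fitted patient, would give the spread of expected VAFs at each cycle, which is the quantity the abstract compares against candidate LODs.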

Articles published on Area Under The Receiver Operating Characteristic

Smoking Classification Using Novel Plasma Cytokines by implementing Machine Learning and Statistical Methods.

Utility of end-tidal carbon dioxide to guide resuscitation termination in prolonged out-of-hospital cardiac arrest

Fungal and bacterial gut microbiota differ between Clostridioides difficile colonization and infection.

Comparison of the diagnostic utility of CHOKAI, STONE and STONE PLUS scores in predicting ureteral stones larger than 5 mm

Wi-Fi Fingerprint for Indoor Keyless Entry Systems with Ensemble Learning Regression-Classification Model

MBMF: Constructing memory banks of multi‐scale features for anomaly detection

Predicting Brain Amyloid Status Using the NIH Toolbox for Assessment of Neurological and Behavioral Function (NIHTB)

Facilitating Clinical Use of the Amsterdam Instrumental Activities of Daily Living Questionnaire: Normative Data and Diagnostic Cutoff Values

AUCReshaping: improved sensitivity at high-specificity

Prognostic algorithms for post-discharge readmission and mortality among mother-infant dyads: an observational study protocol.

Application of machine learning algorithms to construct and validate a prediction model for coronary heart disease risk in patients with periodontitis: a population-based study

A Prognostic Nomogram Survival Model for Newly Diagnosed Patients with AIDS-Related Diffuse Large B-Cell Lymphoma: A Multicenter Cohort Study in China

Development of a Real-Time Dashboard Characterizing Acute Leukemia Clinical Trial Enrollment Diversity at the Practice, Investigator, and Individual Clinician Level

Use of Machine Learning to Predict 30-Day Reutilization of Care for Patients with Sickle Cell Disease Treated for Vaso-Occlusive Crisis

Harnessing Artificial Intelligence for Risk Stratification in Acute Myeloid Leukemia (AML): Evaluating the Utility of Longitudinal Electronic Health Record (EHR) Data Via Graph Neural Networks

Artificial Intelligence: A Reliable Tool to Detect the Elongation of the Styloid Process.

The Predictive Abilities of Machine Learning Algorithms in Patients with Thoracolumbar Spinal Cord Injuries

Optimizing Circulating Tumor DNA Limits of Detection for DLBCL during First Line Therapy

Predicting Major Bleeding in Patients with Venous Thromboembolism on Extended Anticoagulation Therapy Using Follow-up Data and Long Short-Term Memory Based Recurrent Neural Network

879. Host Response Classifiers Identify Infection and Illness Severity and Improve Antibiotic Decision-Making in Patients with Suspected Infections Presenting to the Emergency Department
