Evaluation of a Natural Language Processing Model to Identify and Characterize Patients in the United States With High-Risk Non-Muscle-Invasive Bladder Cancer.

Vikram M Narayan, Haojie Li, Arthur Sillah, Andrew J McMurry, Kentaro Imai, Vladimir Turzhitsky, Despina Siolas, Eric S Meadows

doi:10.1200/cci.23.00096

Abstract

Treatment of non-muscle-invasive bladder cancer (NMIBC) is guided by risk stratification using clinical and pathologic criteria. This study aimed to develop a natural language processing (NLP) model to retrospectively identify patients with high-risk NMIBC from unstructured electronic medical records (EMRs) and to apply the model to describe patient and tumor characteristics. We used three independent EMR-derived data sets of adult patients with a bladder cancer diagnosis in 2011-2020: one for NLP model development and training (n = 140), one for validation (n = 697), and one for application in the retrospective cohort analysis (n = 4,402). Deep learning methods were used to train the NLP model to recognize medical chart terminology corresponding to seven high-risk NMIBC criteria; model performance was assessed using the F1 score, weighted across features. An algorithm was then used to classify each patient as high-risk NMIBC (yes/no). Manually reviewed records served as the gold standard. The F1 scores after model training were >0.7 for all but one uncommon feature (prostatic urethral involvement). The highest area under the receiver operating characteristic curve (AUC) was observed for Ta (0.897) and T1 (0.897); the lowest AUC was for carcinoma in situ (CIS; 0.617). For high-risk NMIBC classification, positive predictive value was 79.4%, negative predictive value was 93.2%, and the false-positive rate was 8.9%. Sensitivity and specificity were 83.7% and 91.1%, respectively. Of 748 patients manually confirmed as having high-risk NMIBC, 196 (26%) had CIS (of whom 19% also had T1 and 23% also had Ta disease); 552 tumors (74%) had no associated CIS. The NLP model, combined with a rule-based algorithm, identified high-risk NMIBC with good performance and will enable future work to study real-world treatment patterns and clinical outcomes for high-risk NMIBC.
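
For readers less familiar with the metrics quoted in the abstract, the following is a minimal sketch of how sensitivity, specificity, positive and negative predictive value, false-positive rate, and F1 are derived from a binary confusion matrix, plus one common way to weight F1 across features. The counts are hypothetical and are not taken from the study, which does not report its confusion matrix in the abstract. The names `ConfusionCounts`, `binary_metrics`, and `weighted_f1` are illustrative, and weighting by the number of gold-standard positives per feature (support) is an assumption; the abstract does not state which weighting was used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts of model calls compared against manual chart review."""
    tp: int  # model: high-risk, manual review: high-risk
    fp: int  # model: high-risk, manual review: not high-risk
    tn: int  # model: not high-risk, manual review: not high-risk
    fn: int  # model: not high-risk, manual review: high-risk


def binary_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Standard binary classification metrics (assumes no zero denominators)."""
    sensitivity = c.tp / (c.tp + c.fn)   # recall, true-positive rate
    specificity = c.tn / (c.tn + c.fp)   # true-negative rate
    ppv = c.tp / (c.tp + c.fp)           # precision
    npv = c.tn / (c.tn + c.fn)
    fpr = c.fp / (c.fp + c.tn)           # always equals 1 - specificity
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "fpr": fpr,
        "f1": f1,
    }


def weighted_f1(f1_by_feature: dict[str, float],
                support_by_feature: dict[str, int]) -> float:
    """Support-weighted mean of per-feature F1 scores (assumed weighting)."""
    total = sum(support_by_feature.values())
    return sum(f1_by_feature[k] * support_by_feature[k]
               for k in f1_by_feature) / total


# Hypothetical counts, for illustration only (NOT the study's data).
example = ConfusionCounts(tp=80, fp=20, tn=180, fn=20)
metrics = binary_metrics(example)

# Hypothetical per-feature F1 scores and supports, for illustration only.
example_weighted = weighted_f1({"feature_a": 0.9, "feature_b": 0.5},
                               {"feature_a": 3, "feature_b": 1})
```

Because the false-positive rate is by definition 1 minus specificity, the reported 8.9% false-positive rate and 91.1% specificity are two views of the same quantity (100 - 91.1 = 8.9).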

Journal: JCO Clinical Cancer Informatics
Publication Date: Sep 1, 2023
License type: CC BY-NC-ND

Similar Papers

Identification of muscle-invasion status in bladder cancer patients using natural language processing and machine learning.
Ruixin Yang ... Amanda M De Hoedt
Journal of Clinical Oncology | VOL. 40
20 Feb 2022

Bacillus Calmette-Guerin (BCG) with or without pembrolizumab (pembro) for high-risk (HR) nonmuscle invasive bladder cancer (NMIBC) that is persistent or recurrent following BCG induction: Phase III KEYNOTE-676 study.
Ashish M Kamat ... Neal D Shore
Journal of Clinical Oncology | VOL. 37
20 May 2019

Context-Based Identification of Muscle Invasion Status in Patients With Bladder Cancer Using Natural Language Processing.
Ruixin Yang ... Florian R Schroeck
JCO Clinical Cancer Informatics | VOL. 6
01 May 2022

Measuring Adoption of Patient Priorities-Aligned Care Using Natural Language Processing of Electronic Health Records: Development and Validation of the Model.
Javad Razjouyan ... Lilian Dindo
JMIR Medical Informatics | VOL. 9
19 Feb 2021
