Novel approach to implementing natural language processing for clinical staging of non-small-cell lung cancer.

Manan P Shah, Jason Wang, John A Glaspy, Laura Prichard, Alex V Gelvezon, Amy Lauren Cummings, Vu Vu

doi:10.1200/jco.2024.42.16_suppl.e13624

Abstract

e13624

Background: Cancer staging is instrumental in driving clinical management and trial enrollment, but staging data are generally unreliable and unstructured in the electronic health record (EHR). Advances in natural language processing (NLP) may facilitate clinical staging and documentation [1], but challenges to real-world implementation include (1) automatically identifying appropriate patients and reports from the EHR and (2) developing an unbiased dataset for training and validation [2]. We describe our institution’s novel approach to overcoming these barriers while building an in-house NLP pipeline for clinical tumor staging of non-small-cell lung cancer (NSCLC).

Methods: We identified patients by searching our EHR (Epic) for a molecular analysis test ordered specifically for pathological diagnoses of NSCLC at our institution. We used the test order date as the diagnosis proxy date (DPD). For each patient, we extracted imaging reports dated from 16 weeks before to 6 weeks after the DPD. To derive primary tumor size, we analyzed the CT Chest or PET/CT report closest to the DPD using an oncology-trained NLP text extraction and labeling tool (John Snow Labs). We cleaned all extracted tumor size entities and identified the largest measurement linked to the lungs. We compared primary tumor measurements from the NLP pipeline with those in a preexisting, manually compiled cancer registry (CNEXT). We analyzed discrepancies manually through chart review.

Results: 542 patients with a DPD between 11/2016 and 9/2023 were processed through the NLP pipeline. Of 443 patients with valid values in both the pipeline and CNEXT, 53% (234) were exact matches and 20% (90) were close matches (within 5 mm), yielding a 73% accuracy rate for values within 5 mm. When mismatched values were manually reviewed, several cases in CNEXT were found to have a DPD differing by more than 3 months and tumor sizes derived from external reports. When these cases were excluded, 320 of the remaining 349 patients had valid values in both the pipeline and the updated manual review. In this refined population, 66% (213) were exact matches and 15% (48) were close matches, yielding an 82% accuracy rate for values within 5 mm.

Conclusions: To our knowledge, this is the first report of a pathology-based method to automatically and reliably identify patients with NSCLC and their relevant imaging reports directly from the EHR. We used a prebuilt NLP tool to derive primary tumor sizes with relatively high accuracy and found that adding flags for timeline discrepancies and external reports can further improve validity. As we near completion of analogous pipelines for node and metastasis staging, we will develop methodology to identify subgroups of patients that can be clinically staged with near-perfect accuracy, ultimately aiming to substantially limit manual staging of uncomplicated cases.

1. Puts 2023.
2. Wang 2022.
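The report-selection window described in the Methods and the match categories used in the Results can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not describe their code, and the NLP extraction step itself is not shown. The 16-week and 6-week window, the two eligible modalities, and the 5 mm tolerance come from the abstract. The names `ImagingReport`, `select_report`, `classify_match`, and `accuracy_within_tolerance` are hypothetical, as are the assumptions that sizes are expressed in millimetres and that the 5 mm bound is inclusive.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Iterable, Optional, Tuple

# Values taken from the abstract.
WEEKS_BEFORE_DPD = 16
WEEKS_AFTER_DPD = 6
CLOSE_MATCH_MM = 5.0
ELIGIBLE_MODALITIES = {"CT Chest", "PET/CT"}


@dataclass
class ImagingReport:
    """Hypothetical container for one imaging report."""
    modality: str
    report_date: date


def select_report(reports: Iterable[ImagingReport],
                  dpd: date) -> Optional[ImagingReport]:
    """Return the eligible report closest to the diagnosis proxy date (DPD).

    Only CT Chest or PET/CT reports dated from 16 weeks before to 6 weeks
    after the DPD are considered. Returns None if no report qualifies.
    On a tie, the first qualifying report in input order is returned.
    """
    start = dpd - timedelta(weeks=WEEKS_BEFORE_DPD)
    end = dpd + timedelta(weeks=WEEKS_AFTER_DPD)
    candidates = [
        r for r in reports
        if r.modality in ELIGIBLE_MODALITIES and start <= r.report_date <= end
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda r: abs((r.report_date - dpd).days))


def classify_match(pipeline_mm: float, registry_mm: float) -> str:
    """Label a pipeline/registry size pair as 'exact', 'close', or 'mismatch'.

    Assumption: 'close' means a nonzero difference of at most 5 mm.
    """
    diff = abs(pipeline_mm - registry_mm)
    if diff == 0:
        return "exact"
    if diff <= CLOSE_MATCH_MM:
        return "close"
    return "mismatch"


def accuracy_within_tolerance(pairs: Iterable[Tuple[float, float]]) -> float:
    """Fraction of pairs that are exact or close matches (0.0 if empty)."""
    labels = [classify_match(p, r) for p, r in pairs]
    if not labels:
        return 0.0
    return sum(label != "mismatch" for label in labels) / len(labels)
```

The reported accuracy figures follow the same arithmetic: (234 + 90) / 443 ≈ 0.73 for the full comparison and (213 + 48) / 320 ≈ 0.82 for the refined population.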
