Extracting longitudinal anticancer treatments at scale using deep natural language processing and temporal reasoning.

Meng Ma, Christopher Gilman, Xiaoyan Wang, Minghao Li, Tommy Mullaney, Arielle Redfern, Rong Chen, Tony Prentice, Yun Mai, Mingwei Zhang, Zongzhi Liu, Kyeryoung Lee, Qi Pan, Paul Mcdonagh, Eric E Schadt

doi:10.1200/jco.2021.39.15_suppl.e18747

Abstract

e18747

Background: Accurate longitudinal cancer treatment data are vital for establishing primary endpoints such as outcome, as well as for investigating adverse events. However, many longitudinal therapeutic regimens are not well captured in structured electronic health records (EHRs). Recognizing them in unstructured data such as clinical notes is therefore critical to obtaining an accurate description of the real-world patient treatment journey. Here, we demonstrate a scalable approach to extract high-quality longitudinal cancer treatments from lung cancer patients' clinical notes using a natural language processing (NLP) pipeline based on bidirectional long short-term memory (BiLSTM) and conditional random fields (CRF).

Methods: The lung cancer (LC) cohort of 4,698 patients was curated from the Mount Sinai Healthcare system (2003-2020). Two domain experts developed a structured framework of entities and semantics that captured treatment and its temporality. The framework included therapy type (chemotherapy, targeted therapy, immunotherapy, etc.), status (on, off, hold, planned, etc.), and temporal reasoning entities and relations (admin_date, duration, etc.). We pre-annotated 149 FDA-approved cancer drugs and longitudinal timelines of treatment on the training corpus. An NLP pipeline was implemented with BiLSTM-CRF-based deep learning models, which were trained and then applied to the clinical notes of the LC cohort. A postprocessor was developed to post-coordinate and refine the output. We performed both cross-evaluation and independent evaluation to assess pipeline performance.

Results: We applied the NLP pipeline to 853,755 clinical notes and identified 1,155 distinct entities for 194 generic cancer drugs, including 74 chemotherapy drugs, 21 immunotherapy drugs, and 99 targeted therapy drugs. We identified chemotherapy, immunotherapy, or targeted therapy data for 3,509 patients in the LC cohort from the clinical notes. Compared to only 2,395 patients with cancer treatments in the structured EHR, this pipeline identified cancer treatments from notes for an additional 2,303 patients who did not have any available cancer treatment data in the structured EHR. Our evaluation schema indicates that the longitudinal cancer drug recognition pipeline delivers strong performance (named entity recognition for drugs and temporal entities: F1 = 95%; drug-temporal relation recognition: F1 = 90%).

Conclusions: We developed a high-performance BiLSTM-CRF-based NLP pipeline to recognize longitudinal cancer treatments. The pipeline recovers and encodes twice as many patients with cancer treatments compared with the structured EHR. Our study indicates that deep NLP with temporal reasoning could substantially accelerate the extraction of treatment profiles at scale. The pipeline is adjustable and can be applied across different cancers.
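The abstract does not describe how the postprocessor assembles its output, so the following Python sketch is only an illustration of how extracted drug mentions and their linked dates might be merged into a per-patient treatment timeline. The `TreatmentMention` schema, the `build_timelines` helper, and the example records are all hypothetical; only the field names (therapy type, status, admin_date) echo the framework described in the Methods.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class TreatmentMention:
    """One drug mention linked to a date (hypothetical schema, not the authors')."""
    patient_id: str
    drug: str           # generic drug name
    therapy_type: str   # e.g. "chemotherapy", "targeted therapy", "immunotherapy"
    status: str         # e.g. "on", "off", "hold", "planned"
    admin_date: date    # date resolved from a drug-temporal relation


def build_timelines(mentions):
    """Group mentions by patient, drop exact duplicates, sort chronologically."""
    by_patient = defaultdict(set)
    for m in mentions:
        # Frozen dataclasses are hashable, so a set removes mentions
        # repeated verbatim across several notes.
        by_patient[m.patient_id].add(m)
    return {
        pid: sorted(ms, key=lambda m: (m.admin_date, m.drug, m.status))
        for pid, ms in by_patient.items()
    }


# Made-up example records for a single patient.
mentions = [
    TreatmentMention("p1", "pembrolizumab", "immunotherapy", "on", date(2019, 3, 4)),
    TreatmentMention("p1", "carboplatin", "chemotherapy", "on", date(2018, 11, 12)),
    TreatmentMention("p1", "carboplatin", "chemotherapy", "on", date(2018, 11, 12)),
    TreatmentMention("p1", "carboplatin", "chemotherapy", "off", date(2019, 2, 1)),
]

timelines = build_timelines(mentions)
for m in timelines["p1"]:
    print(m.admin_date.isoformat(), m.drug, m.therapy_type, m.status)
```

A real postprocessor would likely also need to resolve relative dates, reconcile conflicting statuses, and normalize drug names; none of that is attempted here.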


More From: Journal of Clinical Oncology

