Development of an algorithm using natural language processing to identify metastatic breast cancer patients from clinical notes.

Krishna Kumar Swaminathan, Emma Mendonca, Karpagavalli Thirumalai, Babu Narayanan, Pranay Mukherjee, Rachel Newsome

doi:10.1200/jco.2020.38.15_suppl.e14056

Abstract

e14056

Background: Determining the metastatic status of a patient is important for outcomes research and for assessing candidacy for clinical trials. Structured data in the EMR may not always capture metastatic status, so it is useful to extract it automatically from physician notes. Contextual understanding of the notes is needed to resolve issues such as: (a) local vs. distal metastasis; (b) statements involving family history of metastasis, or a physician instructing the patient to look for certain signs of metastasis; (c) text indicating suspicion or absence of metastasis; (d) indirect utterances, e.g. "cancer has spread to the bone"; and (e) corrections to previous findings.

Methods: We used a set of 20,138 breast cancer patients from the Concerto HealthAI real-world oncology dataset, which includes data from CancerLinQ Discovery, to build and validate a set of NLP algorithms. In total, 5,300 sentences from 1,500 patients were annotated, and the algorithms were manually validated by data abstractors for 500 patients. The algorithms developed were: (1) classification of a sentence into three classes: Distal/Local metastasis, Suspicious, and Other; (2) classification of a sentence into two classes: Distal or Local; (3) classification of a patient into two classes: distal metastasis or not distal metastasis; and (4) multi-label classification for detecting sites of metastasis. Sentence-level algorithms were built using deep learning, and patient-level aggregation of sentence-level predictions was done using ML approaches that included temporal features. A pretrained ULMFiT model was fine-tuned on Concerto HealthAI’s corpus for the sentence classification tasks.

Results: At the sentence level, we obtained an accuracy of 0.85 for the distal/local vs. suspicious vs. irrelevant model and 0.97 for the distal vs. not distal metastasis model. Our patient-level metrics are shown in the table. The classes used for sites of metastasis are Brain, Bone, Lung, Liver, Distant Lymph nodes, and Unknown sites. A subset accuracy (mean fraction of labels which match) of 0.93 was obtained on the hold-out test set at the patient level.

Conclusions: Metastatic status and site of metastasis can be reliably extracted automatically from clinical notes using deep learning techniques. This information will be valuable for clinical trial matching, outcomes research, and other applications. [Table: see text]
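The phrase "subset accuracy (mean fraction of labels which match)" admits two common readings for a multi-label task such as site-of-metastasis detection: exact-match accuracy over the whole label set, or the average per-label agreement. The abstract does not say which was computed, so the following is only a minimal sketch of both on invented data. The function names and the toy patients are hypothetical and are not taken from the authors' evaluation code; only the six site labels come from the abstract.

```python
from typing import List, Set

# Site labels as listed in the abstract.
SITES = ["Brain", "Bone", "Lung", "Liver", "Distant Lymph nodes", "Unknown sites"]


def exact_match_accuracy(y_true: List[Set[str]], y_pred: List[Set[str]]) -> float:
    """Fraction of patients whose predicted site set equals the true set exactly."""
    matches = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return matches / len(y_true)


def mean_label_match(y_true: List[Set[str]], y_pred: List[Set[str]],
                     labels: List[str] = SITES) -> float:
    """Mean over patients of the fraction of per-site labels (present/absent) that agree."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        agree = sum(1 for site in labels if (site in t) == (site in p))
        total += agree / len(labels)
    return total / len(y_true)


# Invented example: four patients, true vs. predicted metastatic sites.
y_true = [{"Bone"}, {"Bone", "Liver"}, {"Brain"}, {"Lung"}]
y_pred = [{"Bone"}, {"Bone"}, {"Brain"}, {"Lung", "Liver"}]

print(exact_match_accuracy(y_true, y_pred))  # 2 of 4 patients match exactly
print(mean_label_match(y_true, y_pred))      # 22 of 24 per-site labels agree
```

On this toy data the two measures diverge (0.5 vs. roughly 0.92), which is why a reported multi-label score is easier to interpret when the exact definition is stated alongside it.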

Journal: Journal of Clinical Oncology
Publication Date: May 20, 2020
Citations: 1

