Abstract

e13591
Background: Relapse is a major concern for oncologists and breast cancer survivors: it necessitates additional treatment and often leads to death. Cancer registries routinely track cancer mortality, but few monitor for relapse because of logistical challenges and prohibitive costs. In this context, Natural Language Processing (NLP) is a promising tool. Merging artificial intelligence with linguistics, NLP can rapidly analyze vast volumes of text in electronic health records. This capability is particularly valuable for Computed Tomography (CT) scans, which are routinely used to characterize breast cancer progression and are described in radiologists' transcribed dictations. We aimed to apply NLP to these text reports to identify breast cancer relapses.
Objective: To automate breast cancer relapse detection and classification in CT text reports using NLP.
Methods: We analyzed 1,445 CT text reports from patients diagnosed with breast cancer between January 1, 2005, and December 31, 2014. Trained human annotators manually reviewed these reports and annotated the text to identify terminology defining local, regional, and distant breast cancer relapses. Annotated reports were partitioned into a training-validation set (90% cohort) and a test set (10% cohort) for NLP model development.
Results: In our dataset of 1,445 CT text reports, 72 (5.0%) were classified as local relapse, 97 (6.7%) as regional relapse, and 743 (51.4%) as distant relapse. On the training-validation dataset, the NLP model achieved 94% (±3.2) accuracy for detection and 96% (±2.9) accuracy for classification (values in parentheses are 95% confidence intervals). Performance was confirmed on the test dataset, with 90% (±4.5) accuracy for detection and 91% (±6.3) accuracy for classification. All metrics are outlined in the Table.
Conclusions: Our model performed excellently at identifying regional and distant relapses in CT reports but had lower sensitivity for local relapses, posing a risk of false negatives. Automating the identification and classification of breast cancer relapses can enhance cancer registry data on patient outcomes if used retrospectively and has the potential to improve patient care if used prospectively. [Table: see text]
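
Illustrative sketch: The abstract does not state how the 90%/10% partition was drawn or how the 95% confidence intervals were computed. The Python below is a minimal sketch under two assumptions: a simple random split of report identifiers, and a normal-approximation (Wald) interval for accuracy. The function names `split_reports` and `accuracy_with_ci`, the seed, and the toy labels are hypothetical and are not taken from the study.

```python
import math
import random


def split_reports(report_ids, test_fraction=0.10, seed=0):
    """Randomly split report IDs into (train_validation, test) lists.

    Assumption: a plain random split; the study may have split differently
    (for example by patient or by relapse class).
    """
    ids = list(report_ids)
    random.Random(seed).shuffle(ids)
    n_test = int(len(ids) * test_fraction)  # truncate toward zero
    return ids[n_test:], ids[:n_test]


def accuracy_with_ci(y_true, y_pred, z=1.96):
    """Return (accuracy, half_width) of a Wald interval.

    Assumption: half_width = z * sqrt(p * (1 - p) / n); z = 1.96 gives a
    nominal 95% interval.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    n = len(y_true)
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    acc = correct / n
    half_width = z * math.sqrt(acc * (1 - acc) / n)
    return acc, half_width


# Usage with the dataset size reported in the abstract (1,445 reports).
train_val, test = split_reports(range(1445))
print(len(train_val), len(test))

# Toy labels only; these are not the study's predictions.
toy_true = [1] * 100
toy_pred = [1] * 90 + [0] * 10
acc, hw = accuracy_with_ci(toy_true, toy_pred)
print(f"accuracy = {acc:.1%} (±{hw * 100:.1f} percentage points)")
```

A Wald interval is only one option; for small test sets or accuracies near 100%, a Wilson or exact binomial interval is generally preferred, and the figures reported in the abstract may have been obtained with a different method.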
