Abstract

2043

Background: Electronic health records (EHR) are used for retrospective cancer outcomes analysis, but sites and timing of recurrence are not captured in structured EHR data. Novel computerized methods are needed to use unstructured longitudinal EHR data for large-scale studies.

Methods: We previously developed a neural network-based NLP algorithm that distinguishes no-recurrence from metastatic-recurrence cases by analyzing physician notes, pathology reports, and radiology reports in Stanford’s breast cancer database, Oncoshare (Cohort A). To validate this algorithm for local vs. distant recurrence, we identified a distinct Oncoshare cohort (Cohort B). Cases were manually curated for longitudinal development of local or distant recurrence and for metastatic sites. A two-sided t-test was used to compare mean probabilities between local and distant recurrence cases. Next, we combined cases in Cohorts A and B to train and validate a novel NLP classifier that identifies metastatic site. The combined cohort was randomly divided into training and validation sets. Sensitivity and specificity were calculated for the NLP algorithm’s ability to detect metastatic sites compared to manual curation.

Results: In Cohort B, 350 metastatic cases were identified. Mean probabilities for local and distant recurrence were 0.43 and 0.79, respectively, and differed significantly (p<0.01). In Cohorts A and B combined, 632 metastatic cases were used for determination of sites. Sensitivity and specificity were highest for detection of peritoneal metastasis, followed by liver, lung, skin, bone, and central nervous system (table).

Conclusions: This NLP algorithm is a scalable tool that uses unstructured EHR data to capture breast cancer recurrence, distinguishing local from distant recurrence and identifying metastatic site. This method may facilitate analysis of large datasets and correlation of outcomes with metastatic site. [Table: see text]
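The two evaluation statistics named in the Methods can be illustrated with a minimal Python sketch. It is not the authors' code: the function names and the toy values are hypothetical, and the Welch (unequal-variance) form of the t statistic is an assumption, since the abstract states only that a two-sided t-test was used. The sketch computes the test statistic only; a p-value would additionally require the t distribution.

```python
from math import sqrt
from statistics import mean, variance


def sensitivity_specificity(curated, predicted):
    """Return (sensitivity, specificity) for paired boolean labels.

    curated   -- manual curation labels (True = metastasis at this site)
    predicted -- classifier labels for the same cases, in the same order
    """
    pairs = list(zip(curated, predicted))
    tp = sum(c and p for c, p in pairs)          # true positives
    fn = sum(c and not p for c, p in pairs)      # false negatives
    tn = sum(not c and not p for c, p in pairs)  # true negatives
    fp = sum(not c and p for c, p in pairs)      # false positives
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


def welch_t_statistic(a, b):
    """Welch two-sample t statistic (assumed variant; statistic only)."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))


# Hypothetical toy data, not taken from Oncoshare.
curated_liver = [True, True, True, False, False, False, False, True]
predicted_liver = [True, True, False, False, False, True, False, True]
print(sensitivity_specificity(curated_liver, predicted_liver))

local_probs = [0.2, 0.4, 0.6]    # hypothetical model outputs, local recurrence
distant_probs = [0.7, 0.8, 0.9]  # hypothetical model outputs, distant recurrence
print(welch_t_statistic(local_probs, distant_probs))
```

In practice the sensitivity and specificity calculation would be repeated once per metastatic site, with manual curation serving as the reference standard.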

