Abstract

Background: Identifying eligible patients for oncology clinical trials ("prescreening") relies on manual chart review by clinical research coordinators (CRCs), which is time-consuming and often inaccurate. Consequently, 70% of patients with cancer who meet trial eligibility criteria are not offered participation. Natural language processing (NLP) may improve the accuracy and timeliness of prescreening.

Methods: This was a preplanned interim analysis of a paired-design noninferiority trial, powered to assess timeliness. We adapted NLP algorithms to identify 13 common eligibility criteria related to cancer type/stage, prior systemic therapy, actionable biomarkers, and response criteria. NLP systems performed optical character recognition, entity and relationship extraction, and natural language inference through deep learning and symbolic AI techniques. Deidentified unstructured electronic health records (EHRs) from real-world patients with non-small cell lung cancer (NSCLC) or colorectal cancer (CrCa) were presented to CRCs via a secure platform. Two CRCs were randomized 1:1 to view blocks of 20 charts each with (Human+AI) or without (Human-alone) NLP annotations. In a pre-trial assessment, the CRCs had 86% inter-rater agreement. The primary outcome was overall chart-level accuracy, defined as the percentage of the 13 CRC-coded eligibility items matching a gold-standard set determined by 2-3 clinicians blinded to experimental arm. Paired t-tests compared overall accuracy (noninferiority margin 5%) and criteria-specific accuracy; significant differences were assessed for superiority using two-sided paired t-tests. Mann-Whitney tests compared the secondary outcome of timeliness (time per chart review).

Results: Among 74 (40 NSCLC; 34 CrCa) of a planned 400 patients, overall accuracy for Human+AI was noninferior to Human-alone (78.7% vs. 76.7%, mean difference 2.0%, p<0.001 rejecting inferiority); both were greater than AI-alone (63.5%). Superiority for Human+AI was demonstrated for RECIST response, but not for overall accuracy or other criteria-specific accuracies (Table). Median time per review was lower for Human+AI than for Human-alone (34.1 vs. 43.9 min, adjusted p=0.05).

Conclusions: Human+AI teams can improve the timeliness of trial prescreening with noninferior accuracy. This platform is being used for eligibility assessment in an ongoing clinical trial. [Table: see text]
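
As a rough illustration of the primary outcome defined in the Methods, the sketch below computes chart-level accuracy as the percentage of eligibility items for which a CRC's coding matches the clinician gold standard. The item names, values, and function are hypothetical placeholders, not the study's code; the study used 13 items per chart.

```python
def chart_accuracy(coded: dict[str, str], gold: dict[str, str]) -> float:
    """Percent of eligibility items where the CRC coding matches the gold standard."""
    matches = sum(coded.get(item) == answer for item, answer in gold.items())
    return 100.0 * matches / len(gold)

# Hypothetical example using 4 of a chart's eligibility items:
gold = {"cancer_type": "NSCLC", "stage": "IV",
        "prior_systemic_therapy": "yes", "actionable_biomarker": "positive"}
coded = {"cancer_type": "NSCLC", "stage": "IV",
         "prior_systemic_therapy": "no", "actionable_biomarker": "positive"}

print(chart_accuracy(coded, gold))  # 75.0 -> 3 of 4 items match
```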

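The statistical plan can be sketched in the same way: a one-sided paired t-test against the 5% noninferiority margin (equivalent to testing whether the paired differences, shifted by the margin, exceed zero), a two-sided paired t-test for superiority, and a Mann-Whitney test for time per chart review. The arrays below are fabricated placeholders solely to make the example runnable; they are not the study's data.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Placeholder per-chart accuracy scores (%) for the two paired arms.
human_ai = rng.normal(78.7, 10.0, size=74)
human_alone = rng.normal(76.7, 10.0, size=74)

# Noninferiority at margin 5%: H0: mean(diff) <= -5 vs. H1: mean(diff) > -5,
# tested by shifting the paired differences by the margin.
diff = human_ai - human_alone
t_ni = stats.ttest_1samp(diff + 5.0, 0.0, alternative="greater")
print(f"noninferiority: t={t_ni.statistic:.2f}, p={t_ni.pvalue:.4f}")

# Superiority: conventional two-sided paired t-test on the same pairs.
t_sup = stats.ttest_rel(human_ai, human_alone)
print(f"superiority: t={t_sup.statistic:.2f}, p={t_sup.pvalue:.4f}")

# Secondary outcome (timeliness): Mann-Whitney test on time per chart review (min).
time_ai = rng.normal(34.1, 8.0, size=74)
time_alone = rng.normal(43.9, 8.0, size=74)
u = stats.mannwhitneyu(time_ai, time_alone, alternative="two-sided")
print(f"timeliness: U={u.statistic:.0f}, p={u.pvalue:.4f}")
```
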