Abstract

PurposeBladder cancer is initially diagnosed and staged with a transurethral resection of bladder tumor (TURBT). Patient survival is dependent on appropriate sampling of layers of the bladder, but pathology reports are dictated as free text, making large-scale data extraction for quality improvement challenging. We sought to automate extraction of stage, grade, and quality information from TURBT pathology reports using natural language processing (NLP).MethodsPatients undergoing TURBT were retrospectively identified using the Northwestern Enterprise Data Warehouse. An NLP algorithm was then created to extract information from free-text pathology reports and was iteratively improved using a training set of manually reviewed TURBTs. NLP accuracy was then validated using another set of manually reviewed TURBTs, and reliability was calculated using Cohen’s κ.ResultsOf 3,042 TURBTs identified from 2006 to 2016, 39% were classified as benign, 35% as Ta, 11% as T1, 4% as T2, and 10% as isolated carcinoma in situ...

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call