Abstract

We aimed to demonstrate the utility of a natural language processing (NLP) algorithm for mining kidney stone composition data from a large-scale electronic health records (EHR) repository.

We developed StoneX, a pattern-matching method for extracting kidney stone composition information from clinical notes. We trained the extraction algorithm on manually annotated text mentions of calcium oxalate monohydrate, calcium oxalate dihydrate, hydroxyapatite, brushite, uric acid, and struvite stones. We then applied StoneX to more than 125 million notes from our institutional EHR to identify patients with kidney stone composition data. Analyses performed on the extracted patients included stone type conversions over time, survival analysis from a second stone surgery, and disease associations by stone composition to validate the phenotyping method against known associations.

The NLP algorithm identified 45,235 text mentions corresponding to 11,585 patients. The system achieved a positive predictive value above 90% for calcium oxalate monohydrate, calcium oxalate dihydrate, hydroxyapatite, brushite, and struvite; uric acid was the exception, with a positive predictive value of 87.5%. Survival analysis from a second stone surgery showed statistically significant differences among stone types (P = .03). Several phenotype associations were found: uric acid with type 2 diabetes (odds ratio [OR] = 2.69, 95% confidence interval [CI] = 1.91-3.79), struvite with neurogenic bladder (OR = 12.27, 95% CI = 4.33-34.79), struvite with urinary tract infection (OR = 7.36, 95% CI = 3.01-17.99), hydroxyapatite with pulmonary collapse (OR = 3.67, 95% CI = 2.10-6.42), hydroxyapatite with neurogenic bladder (OR = 5.23, 95% CI = 2.05-13.36), brushite with calcium metabolism disorder (OR = 4.59, 95% CI = 2.14-9.81), and brushite with hypercalcemia (OR = 4.09, 95% CI = 1.90-8.80).

NLP extraction of kidney stone composition from large-scale EHRs is feasible with high precision, enabling high-throughput epidemiological studies of kidney stone disease. These tools will enable high-fidelity kidney stone research from the EHR.
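
To make the pattern-matching idea concrete, the following is a minimal sketch of how mentions of the six stone types might be pulled from a clinical note with regular expressions. It is not the StoneX implementation, whose rules are not given here: the pattern table `STONE_PATTERNS`, the function `extract_stone_mentions`, and the choice of synonyms (the mineral names whewellite and weddellite, and "magnesium ammonium phosphate" for struvite) are all assumptions made for illustration.

```python
import re

# Hypothetical surface-form patterns, one per stone type named in the abstract.
# These are illustrative assumptions, not the rules used by StoneX.
STONE_PATTERNS = {
    "calcium oxalate monohydrate": r"calcium\s+oxalate\s+monohydrate|whewellite",
    "calcium oxalate dihydrate": r"calcium\s+oxalate\s+dihydrate|weddellite",
    "hydroxyapatite": r"hydroxyl?apatite",
    "brushite": r"brushite",
    "uric acid": r"uric\s+acid",
    "struvite": r"struvite|magnesium\s+ammonium\s+phosphate",
}

# Compile once; clinical notes vary in capitalization, so match case-insensitively.
COMPILED = {
    label: re.compile(pattern, re.IGNORECASE)
    for label, pattern in STONE_PATTERNS.items()
}


def extract_stone_mentions(note):
    """Return (stone_type, matched_text) pairs in order of appearance in the note."""
    found = []
    for label, pattern in COMPILED.items():
        for match in pattern.finditer(note):
            found.append((match.start(), label, match.group(0)))
    found.sort()  # order by character offset within the note
    return [(label, text) for _, label, text in found]


if __name__ == "__main__":
    example = "Stone analysis: 80% calcium oxalate monohydrate, 20% hydroxylapatite."
    for stone_type, text in extract_stone_mentions(example):
        print(stone_type, "<-", text)
```

A sketch this simple would over-count in practice: it does not handle negation ("no struvite") or non-stone contexts such as a serum uric acid level, which is one plausible reason a real system needs annotated mentions to tune and evaluate its rules.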
