Abstract

Introduction: We aimed to identify and characterize carotid artery stenosis (CAS), defined as ≥40% stenosis in the internal carotid artery (ICA) or the common carotid artery (CCA), among participants of the Mayo Vascular Disease Biorepository (VDB, n = 11,814).

Hypothesis: A string-based text processing (TP) tool coded in R can analyze radiology text reports to ascertain and characterize CAS.

Methods: Each radiology report was imported into R and segmented for analysis, and each segment was tested for 22 unique features. The TP tool was able to (i) identify stenosis in the ICA or the CCA while excluding stenosis of the vertebral, subclavian, or external carotid arteries, (ii) recognize both quantitative and qualitative descriptions of CAS, (iii) interpret numerical expressions by considering the surrounding text, (iv) detect negations, and (v) ascertain previous revascularization procedures if mentioned in the text. To assess the performance of the TP tool in terms of precision, recall, and F-score, we manually ascertained and characterized CAS in 800 randomly selected radiology reports.

Results: We retrieved 21,651 imaging reports for 6,342 unique patients, including 20,461 ultrasonography reports, 812 magnetic resonance imaging reports, 357 computed tomography scan reports, and 21 interventional radiology reports. The performance of the TP tool is outlined in Table 1.

Conclusion: We demonstrate that a string-based TP tool coded in R is efficient and accurate in identifying and characterizing CAS from carotid radiology reports. The tool does not require natural language processing and could be used in a wide range of settings for genetic and biomarker studies.
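The segment-and-test approach described in the Methods can be illustrated with a minimal sketch. It is written in Python rather than R and is not the authors' tool: the abstract does not list the 22 features, so every pattern, keyword list, and function name below (`segment_has_cas`, `report_has_cas`, `mentions_revascularization`) is a hypothetical stand-in. The sketch assumes that a segment is a sentence, that the lower bound of a percentage range is compared against the 40% threshold, and that a "less than N%" expression counts as below threshold.

```python
import re

# All patterns are illustrative assumptions, not the published feature set.
TARGET_VESSEL = re.compile(r"\b(internal carotid|common carotid|ica|cca)\b", re.I)
EXCLUDED_VESSEL = re.compile(r"\b(vertebral|subclavian|external carotid|eca)\b", re.I)
STENOSIS = re.compile(r"\b(stenosis|stenoses|narrowing|occlu\w+)\b", re.I)
NEGATION = re.compile(r"\b(no|without|negative for|absence of)\b", re.I)
QUALITATIVE = re.compile(r"\b(moderate|severe|high[- ]grade|critical|occlu\w+)\b", re.I)
REVASC = re.compile(r"\b(endarterectomy|stent\w*|cea)\b", re.I)
# Matches "70%", "50-69%", "50 to 69 %", optionally preceded by "less than" or "<".
PERCENT = re.compile(
    r"(?P<lt>(?:less than|<)\s*)?(?P<low>\d{1,3})"
    r"(?:\s*(?:-|to)\s*(?P<high>\d{1,3}))?\s*%",
    re.I,
)


def segment_has_cas(segment, threshold=40):
    """Return True if one segment describes ICA/CCA stenosis at or above threshold."""
    # (i) Keep only segments about the ICA or CCA, and skip excluded vessels.
    if not TARGET_VESSEL.search(segment) or EXCLUDED_VESSEL.search(segment):
        return False
    if not STENOSIS.search(segment):
        return False
    # (iv) A negation anywhere in the segment rules it out (a crude assumption).
    if NEGATION.search(segment):
        return False
    # (ii, iii) Quantitative description, read together with its qualifier.
    match = PERCENT.search(segment)
    if match:
        if match.group("lt"):
            return False  # assumption: "less than N%" is treated as below threshold
        return int(match.group("low")) >= threshold
    # (ii) Qualitative description such as "severe stenosis".
    return bool(QUALITATIVE.search(segment))


def report_has_cas(report, threshold=40):
    """Split a report into sentence-like segments and test each one."""
    segments = re.split(r"(?<=[.;])\s+", report.strip())
    return any(segment_has_cas(s, threshold) for s in segments)


def mentions_revascularization(report):
    """(v) Flag reports that mention a prior revascularization procedure."""
    return bool(REVASC.search(report))
```

Under these assumptions, `report_has_cas("Right ICA: 50-69% stenosis. Left vertebral artery: 70% stenosis.")` flags the report because of the ICA sentence alone, while the vertebral sentence is ignored. A production tool would need far more careful negation scoping and numeric interpretation than this sketch provides.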
