Abstract
Radiological measurements are reported in free-text reports, and it is challenging to extract such measurements for treatment-planning tasks such as lesion summarization and cancer response assessment. The purpose of this work is to develop and evaluate a natural language processing (NLP) pipeline that can extract measurements and their core descriptors, such as temporality, anatomical entity, imaging observation, RadLex descriptors, series number, image number, and segment, from a wide variety of radiology reports (MR, CT, and mammogram). We created a hybrid NLP pipeline that integrates rule-based feature extraction modules and a conditional random field (CRF) model to extract measurements from radiology reports and link them with clinically relevant features such as anatomical entities or imaging observations. The pipeline was trained on 1117 CT/MR reports, and the performance of the system was evaluated on an independent set of 100 expert-annotated CT/MR reports and also tested on 25 mammography reports. The system detected 813 out of 806 measurements in the CT/MR reports; 784 were true positives, 29 were false positives, and 0 were false negatives. Similarly, from the mammography reports, 96% of the measurements with their modifiers were extracted correctly. Our approach could enable the development of computerized applications that utilize summarized lesion measurements from radiology reports of varying modalities and improve practice by tracking the same lesions across multiple radiologic encounters.
Highlights
Radiology reports include a great variety of information about normal and abnormal structures in a free text format
Notwithstanding the foregoing limitations and challenges, we believe there is potential for clinical utility of our approach to improve radiologist practice by enabling automatic measurement extraction and summarization from radiology reports
The measurement and auxiliary regular expressions are used as defined by Sevenster et al. [14]. m specifies the measurement regular expression to match, which is built from the other regular expressions listed under Measurement; x specifies the numerical part of the measurement, whereas cm specifies the unit
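As a rough illustration of the roles of m, x, and cm described above, the following Python sketch composes a measurement pattern from a numerical part and a unit part. The patterns are assumptions made for illustration only and are not the expressions defined by Sevenster et al. [14]; the names NUMBER, UNIT, SEP, MEASUREMENT, and extract_measurements are hypothetical.

```python
import re

# Hypothetical building blocks, chosen for illustration only.
NUMBER = r"\d+(?:\.\d+)?"   # numerical part of a measurement ("x" in the text)
UNIT = r"(?:mm|cm)"         # unit part ("cm" in the text)
SEP = r"\s*(?:x|by)\s*"     # separator between dimensions, e.g. "1.2 x 3.4"

# Measurement pattern ("m" in the text): one to three numbers followed by a unit.
# The lookbehind keeps a match from starting in the middle of a word or number.
MEASUREMENT = re.compile(
    rf"(?<![\w.])(?P<values>{NUMBER}(?:{SEP}{NUMBER}){{0,2}})\s*(?P<unit>{UNIT})\b",
    re.IGNORECASE,
)


def extract_measurements(text):
    """Return a list of (values, unit) tuples found in a report sentence."""
    results = []
    for match in MEASUREMENT.finditer(text):
        values = [float(v) for v in re.findall(NUMBER, match.group("values"))]
        results.append((values, match.group("unit").lower()))
    return results


report = (
    "There is a 1.2 x 3.4 cm lesion in segment 7 (series 3, image 45). "
    "A 5 mm nodule is unchanged."
)
print(extract_measurements(report))
```

In this sketch, bare numbers such as the segment, series, and image numbers are skipped because no unit follows them. Forms that repeat the unit after each dimension (for example "1.2 cm x 3.4 cm") are not handled and would need additional patterns.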
Summary
Radiology reports include a great variety of information about normal and abnormal structures in a free-text format. For cancer patients, radiology reports describe measurements of cancer lesions, and interval changes in lesion size are crucial indicators of response or resistance to cancer therapies. Measurements of lesion size (as well as organ size) are the predominant type of quantitative data recorded within radiology reports. Unlike other numerical, phenotypic evidence, such as lab values in ED notes, measurements are recorded as free text, which hampers extraction and utilization of such data by computer applications. Radiologists and clinicians need to ferret out lesion measurements from the radiology report to assess changes in tumor burden. Beyond the way measurements are reported, the use of different templates in general radiology reporting further obstructs automatic information extraction.