Abstract

Purpose: Diagnostic statements for pituitary adenomas (PAs) are complex and unstandardized. We aimed to determine the most commonly used elements contained in the statements and their combination patterns and variations in real-world clinical practice, with the ultimate goal of promoting standardized diagnostic recording and establishing an efficient element extraction process.

Methods: We included patient medical records from 2012 to 2020 that listed PA among the first three diagnoses. After manually labeling the elements in the diagnostic texts, we obtained element types and training sets, and used them to build an information extraction model based on the word segmentation model “Jieba” to extract the information contained in the remaining diagnostic texts.

Results: A total of 576 different diagnostic statements from 4010 texts of 3770 medical records were included in the analysis. The first ten diagnostic elements related to PA were histopathology, tumor location, endocrine status, tumor size, invasiveness, recurrence, diagnostic confirmation, Knosp grade, residual tumor, and refractoriness. The automated extraction model achieved F1-scores of 100% for all ten elements in the second round and 97.3–100.0% on a test set consisting of an additional 532 diagnostic texts. Tumor location, endocrine status, histopathology, and tumor size were the most commonly used elements, and diagnoses composed of these elements were the most frequent. Endocrine status had the greatest expression variability, followed by Knosp grade. Among all the terms, tumor size had one of the highest rates of missing information (21%). Among statements in which PA was the principal diagnosis, 18.6% lacked information on tumor size; among those with other principal diagnoses, this percentage rose to 48% (P < 0.001).

Conclusion: Standardization of the diagnostic statement for PAs is unsatisfactory in real-world clinical practice. This study could help standardize a structured pattern for PA diagnosis and establish a foundation for research-friendly, high-quality clinical information extraction.
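
As a rough illustration of the kind of pipeline the abstract describes, the sketch below tags diagnostic elements in a statement by dictionary lookup and scores the result with a per-element F1. It is not the authors' model: a plain substring lookup stands in for Jieba's segmentation, and the lexicon, element names, English example terms, and function names are all hypothetical. Substring matching is crude (for example, it cannot tell "invasive" from "non-invasive"), which is one reason a real system would segment the text first.

```python
from typing import Dict, List, Set

# Hypothetical lexicon: element type -> surface terms. These English terms are
# illustrative stand-ins, not the vocabulary used in the study.
LEXICON: Dict[str, List[str]] = {
    "tumor_size": ["macroadenoma", "microadenoma"],
    "endocrine_status": ["non-functioning", "prolactin-secreting"],
    "invasiveness": ["invasive"],
    "recurrence": ["recurrent"],
}


def extract_elements(statement: str) -> Set[str]:
    """Return the element types whose terms occur in the statement."""
    text = statement.lower()
    return {
        element
        for element, terms in LEXICON.items()
        if any(term in text for term in terms)
    }


def f1_for_element(element: str,
                   predicted: List[Set[str]],
                   gold: List[Set[str]]) -> float:
    """F1 for one element type over paired predicted/gold label sets."""
    pairs = list(zip(predicted, gold))
    tp = sum(1 for p, g in pairs if element in p and element in g)
    fp = sum(1 for p, g in pairs if element in p and element not in g)
    fn = sum(1 for p, g in pairs if element not in p and element in g)
    denom = 2 * tp + fp + fn
    # Convention assumed here: an element absent from both sides scores 1.0.
    return 2 * tp / denom if denom else 1.0


def missing_rate(element: str, labels: List[Set[str]]) -> float:
    """Share of statements that carry no information on the given element."""
    return sum(1 for s in labels if element not in s) / len(labels)


if __name__ == "__main__":
    statements = [
        "Invasive pituitary macroadenoma, non-functioning",
        "Recurrent pituitary microadenoma",
        "Pituitary adenoma",
    ]
    predicted = [extract_elements(s) for s in statements]
    for statement, elements in zip(statements, predicted):
        print(statement, "->", sorted(elements))
    print("tumor_size missing rate:", missing_rate("tumor_size", predicted))
```

The same `missing_rate` helper could be applied separately to statements grouped by principal diagnosis to reproduce the kind of comparison reported for tumor size.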
