Abstract

Textual sources in electronic health records, such as medication signatures (sigs), contain valuable information that is not always available in structured form. These sources are commonly processed through manual annotation, a repetitive and time-consuming task that could be fully automated using large language models (LLMs). While most sigs include simple instructions, some include complex patterns. We aimed to compare the performance of GPT-3.5 and GPT-4 with smaller fine-tuned models (ClinicalBERT, BlueBERT) in extracting the average daily dose of 2 immunomodulating medications with frequent complex sigs: hydroxychloroquine and prednisone. Using manually annotated sigs as the gold standard, we compared the performance of these models on 702 hydroxychloroquine and 22,104 prednisone prescriptions. GPT-4 vastly outperformed all other models for this task at any level of in-context learning. With 100 in-context examples, the model correctly annotated 94% of hydroxychloroquine and 95% of prednisone sigs to within 1 significant digit. Error analysis conducted by 2 additional manual annotators on annotator-model disagreements suggests that the vast majority of disagreements were model errors. Many model errors related to ambiguous sigs on which there was also frequent annotator disagreement. Paired with minimal manual annotation, GPT-4 achieved excellent performance for language regression of complex medication sigs and vastly outperformed GPT-3.5, ClinicalBERT, and BlueBERT. However, the number of in-context examples it needed to reach maximum performance was similar to that of GPT-3.5. LLMs show great potential to rapidly extract structured data from sigs in a no-code fashion for clinical and research applications.
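The "within 1 significant digit" criterion can be made concrete with a short sketch. The abstract does not give the exact scoring procedure, so the code below rests on an assumption: a prediction counts as correct when it and the gold-standard daily dose round to the same value at 1 significant digit. The function names (`round_sig`, `agrees`, `accuracy`) and the example doses are hypothetical and not taken from the study.

```python
import math


def round_sig(x: float, sig: int = 1) -> float:
    """Round x to `sig` significant digits; zero is returned unchanged."""
    if x == 0:
        return 0.0
    exponent = math.floor(math.log10(abs(x)))
    return round(x, sig - 1 - exponent)


def agrees(pred: float, gold: float, sig: int = 1) -> bool:
    """Assumed criterion: both values match after rounding to `sig` digits."""
    return round_sig(pred, sig) == round_sig(gold, sig)


def accuracy(preds: list[float], golds: list[float], sig: int = 1) -> float:
    """Fraction of predictions that agree with the gold annotation."""
    if len(preds) != len(golds):
        raise ValueError("preds and golds must have the same length")
    if not golds:
        raise ValueError("at least one annotated sig is required")
    matches = sum(agrees(p, g, sig) for p, g in zip(preds, golds))
    return matches / len(golds)


# Hypothetical average daily doses in mg (model output vs. manual annotation).
model_doses = [400.0, 428.6, 17.5, 5.0]
gold_doses = [400.0, 400.0, 12.5, 5.0]
score = accuracy(model_doses, gold_doses)
```

Under this assumed criterion, 428.6 mg and 400 mg count as agreeing (both round to 400), while 17.5 mg and 12.5 mg do not (20 versus 10). A relative-error tolerance would be an equally plausible reading of the criterion and would score some boundary cases differently.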
