Abstract

Fall-related injuries (FRIs) are a major cause of hospitalizations among older patients, but identifying them in unstructured clinical notes poses challenges for large-scale research. In this study, we developed and evaluated Natural Language Processing (NLP) models to address this issue. We used all available clinical notes from Mass General Brigham for 2,100 older adults and identified 154,949 paragraphs of interest by automatically scanning for FRI-related keywords. Two clinical experts directly labeled 5,000 paragraphs to produce benchmark-standard labels. In addition, 3,689 validated patterns were annotated, which indirectly labeled 93,157 paragraphs with validated-standard labels. Five models (vanilla BERT, RoBERTa, Clinical-BERT, Distil-BERT, and SVM) were trained on 2,000 benchmark paragraphs and all validated paragraphs. The BERT-based models were trained in three stages: Masked Language Modeling, General Boolean Question Answering (QA), and QA for FRI. Of the remaining benchmark paragraphs, 500 were used for validation and 2,500 for testing. The models were compared on precision, recall, F1 score, and the areas under the ROC (AUROC) and Precision-Recall (AUPR) curves, with RoBERTa showing the best performance: precision of 0.90 [0.88-0.91], recall of [0.90-0.93], F1 score of 0.90 [0.89-0.92], and AUROC and AUPR both of 0.96 [0.95-0.97]. These NLP models accurately identify FRIs from unstructured clinical notes, potentially enhancing the efficiency of clinical notes-based research.
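
The benchmark split and the threshold-based metrics described above can be illustrated with a minimal Python sketch. It is not the study's code: the function names `split_benchmark` and `precision_recall_f1`, the random shuffle with a fixed seed, and the toy labels are all assumptions made for illustration. The abstract gives only the split sizes (2,000 / 500 / 2,500 out of 5,000) and the names of the metrics, not how the split was drawn.

```python
import random


def split_benchmark(n_total=5000, n_train=2000, n_val=500, seed=0):
    """Shuffle paragraph indices and cut them into train/validation/test.

    The random shuffle is an assumption; the abstract states only the sizes.
    """
    idx = list(range(n_total))
    random.Random(seed).shuffle(idx)
    train = idx[:n_train]
    val = idx[n_train:n_train + n_val]
    test = idx[n_train + n_val:]  # the remaining paragraphs
    return train, val, test


def precision_recall_f1(y_true, y_pred):
    """Binary precision, recall, and F1 with 1 as the positive (FRI) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


train, val, test = split_benchmark()
print(len(train), len(val), len(test))  # 2000 500 2500

# Toy labels for illustration only, not data from the study.
y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 1, 0, 0, 1, 1]
print(precision_recall_f1(y_true, y_pred))  # (0.75, 0.75, 0.75)
```

In the toy example there are three true positives, one false positive, and one false negative, so precision and recall are both 3/4 and F1 equals the same value. AUROC and AUPR are omitted here because they require ranked prediction scores rather than hard labels.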
