The increasing adoption of electronic medical records (EMRs) presents an opportunity to improve trauma care through data-driven insights. However, extracting meaningful, actionable information from unstructured clinical text remains a significant challenge. To address this gap, this study applies natural language processing (NLP) techniques to extract injury-related variables and to classify trauma patients by the presence of loss of consciousness (LOC). A dataset of 23,308 trauma patient EMRs, including pre-diagnosis and post-diagnosis free-text notes, was analyzed using a bilingual (English and Korean) pre-trained RoBERTa model. Patients were categorized into four groups according to the presence of LOC and head trauma. To address class imbalance in LOC labeling, deep learning models were trained with weighted loss functions, achieving an area under the curve (AUC) of 0.91. Local Interpretable Model-agnostic Explanations (LIME) analysis further showed that the model identified critical terms related to head injuries and consciousness. These results indicate that NLP can effectively identify LOC in trauma patients’ EMRs, with weighted loss functions mitigating class imbalance. The findings can inform the development of AI tools to improve trauma care and decision-making.
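The abstract does not specify how the weighted loss functions were constructed. As a minimal sketch of the general idea, assuming inverse-frequency class weights (the same heuristic scikit-learn calls "balanced") and a binary LOC label, the weighting could look like the following. The function names and the toy labels are hypothetical and are not taken from the study.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Assumed scheme: weight = n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * c) for cls, c in counts.items()}


def weighted_cross_entropy(probs, labels, weights):
    """Weighted mean negative log-likelihood of the true class.

    probs[i] is the predicted probability that sample i is positive (label 1).
    """
    eps = 1e-12  # guards against log(0)
    total, weight_sum = 0.0, 0.0
    for p, y in zip(probs, labels):
        p_true = p if y == 1 else 1.0 - p
        w = weights[y]
        total += -w * math.log(max(p_true, eps))
        weight_sum += w
    return total / weight_sum


# Hypothetical toy labels: 1 = LOC present (minority), 0 = LOC absent.
labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
weights = inverse_frequency_weights(labels)

# A model that always predicts "LOC unlikely" is penalized more heavily
# once the minority class is up-weighted.
probs = [0.1] * len(labels)
unweighted_loss = weighted_cross_entropy(probs, labels, {0: 1.0, 1: 1.0})
weighted_loss = weighted_cross_entropy(probs, labels, weights)
```

Under this scheme the rarer class receives the larger weight, so errors on LOC-positive notes contribute more to the loss than errors on the majority class. In a deep learning framework the same weights would typically be passed to the built-in loss rather than computed by hand.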