Sarcopenia (low muscle mass and strength) causes dysmobility and loss of independence, yet it is often not directly coded or described in electronic health records (EHR). The objective was to improve sarcopenia detection using structured EHR data. Adults undergoing musculoskeletal testing (December 2017 to March 2020) were classified by the number of tests on which they met sarcopenia thresholds: 0 (controls), ≥1 (Sarcopenia-1), or ≥2 (Sarcopenia-2). EHR diagnoses, medications, and laboratory tests were extracted from the Indiana Network for Patient Care. Five machine learning models were applied to the EHR data to predict sarcopenia. Of 1304 participants, 1055 were controls and 249 met Sarcopenia-1, of whom 76 also met Sarcopenia-2. Sarcopenic participants were older and had higher fat mass, a higher Charlson Comorbidity Index, and more chronic diseases. All models performed better for Sarcopenia-2 than for Sarcopenia-1. The top-performing models for Sarcopenia-1 were Logistic Regression [area under the curve (AUC) 71.59 (95% confidence interval [CI], 71.51-71.66)] and Multi-Layer Perceptron [AUC 71.48 (95% CI, 71.00-71.97)]. The top-performing models for Sarcopenia-2 were Logistic Regression [AUC 91.44 (95% CI, 91.28-91.60)] and Support Vector Machine [AUC 90.81 (95% CI, 88.41-93.20)]. For the best Logistic Regression model, important sarcopenia predictors included diabetes mellitus, digestive system complaints, signs and symptoms involving the nervous, musculoskeletal, and respiratory systems, metabolic disorders, and kidney or urinary tract disorders. Opioids, corticosteroids, and antihyperlipidemic drugs were also more common among sarcopenic participants. Machine learning models applied to structured EHR data can predict sarcopenia; with further development in future studies, this approach may facilitate large-scale early detection and intervention in clinical populations.
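The grouping rule and the AUC metric reported above can be illustrated with a minimal Python sketch. It is not the study's code: the function names `sarcopenia_groups` and `auc`, and the toy inputs in the usage lines, are hypothetical and chosen only for illustration. The sketch assumes each participant is summarized by the number of musculoskeletal tests on which they met a sarcopenia threshold, and it computes AUC as the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative case (ties count as one half), returned as a fraction between 0 and 1.

```python
from typing import Dict, Sequence


def sarcopenia_groups(n_tests_met: int) -> Dict[str, bool]:
    """Map the number of tests meeting sarcopenia thresholds to the three groups.

    Sarcopenia-2 (>=2 tests) is a subset of Sarcopenia-1 (>=1 test).
    """
    if n_tests_met < 0:
        raise ValueError("n_tests_met must be non-negative")
    return {
        "control": n_tests_met == 0,
        "sarcopenia_1": n_tests_met >= 1,
        "sarcopenia_2": n_tests_met >= 2,
    }


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    # Count positive-negative pairs ranked correctly; ties contribute 0.5.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: tests met per participant and model scores.
tests_met = [0, 0, 1, 2, 0, 3]
scores = [0.10, 0.40, 0.35, 0.80, 0.20, 0.90]

y_s1 = [int(sarcopenia_groups(t)["sarcopenia_1"]) for t in tests_met]
y_s2 = [int(sarcopenia_groups(t)["sarcopenia_2"]) for t in tests_met]

auc_s1 = auc(y_s1, scores)
auc_s2 = auc(y_s2, scores)
```

Because the same score vector is evaluated against two label definitions, the sketch also shows how a single model can yield different AUCs for Sarcopenia-1 and Sarcopenia-2. The pairwise form is quadratic in sample size; it is used here only because it makes the definition explicit.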