The diagnosis of sarcopenia relies heavily on human and equipment resources and requires individuals to visit medical institutions in person. The objective of this study was to develop a test-free, self-assessable approach to identifying sarcopenia using artificial intelligence techniques and representative real-world data. This multicentre study enrolled 11 661 middle-aged and older adults from a national survey initiated in 2011. Follow-up data from the baseline cohort collected in 2013 (n = 9403) and 2015 (n = 10 356) were used for validation. Sarcopenia was retrospectively diagnosed using the Asian Working Group for Sarcopenia 2019 framework. Baseline age, sex, height, weight and 20 functional capacity (FC)-related binary indices (activities of daily living = 6, instrumental activities of daily living = 5 and other FC indices = 9) were considered as predictors. Multiple machine learning (ML) models were trained and cross-validated using 70% of the baseline data to predict sarcopenia. The remaining 30% of the baseline data (the holdout set), along with the two follow-up datasets, were used to assess model performance. The study included 5634 men and 6027 women (median age = 57.0 years). Sarcopenia was identified in 1288 (11.0%) individuals. Among the 20 FC indices, the running/jogging 1 km item showed the highest predictive value for sarcopenia (AUC [95% CI] = 0.633 [0.620-0.647]). Among the ML models assessed, a 24-variable gradient boosting classifier (GBC) model was selected. This GBC model demonstrated favourable performance in predicting sarcopenia in the holdout data (AUC [95% CI] = 0.831 [0.808-0.853], accuracy = 0.889, recall = 0.441, precision = 0.475, F1 score = 0.458, Kappa = 0.396 and Matthews correlation coefficient = 0.396). Temporal validation on the two longitudinal follow-up datasets also demonstrated good performance (AUC [95% CI]: 0.833 [0.818-0.848] and 0.852 [0.840-0.865], respectively).
The model's built-in feature importance ranking and the SHapley Additive exPlanations (SHAP) method identified lifting 5 kg and running/jogging 1 km, respectively, as relatively important contributors to the model's predictive capacity among the 20 FC items. The calibration curve of the model indicated good agreement between predictions and actual observations (Hosmer-Lemeshow p = 0.501, 0.451 and 0.374 for the three test sets, respectively), and decision curve analysis supported its clinical usefulness. The model was implemented as an online web application and exported as a deployable binary file, allowing for flexible, individualized risk assessment. We developed an artificial intelligence model that can assist in the identification of sarcopenia, particularly in settings lacking the resources necessary for a comprehensive diagnosis. These findings offer potential for improving decision-making and facilitating the development of novel management strategies for sarcopenia.
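The performance measures reported for the GBC model (accuracy, recall, precision, F1 score, Kappa, Matthews correlation coefficient and AUC) all follow from standard definitions. The sketch below shows how they could be computed from a binary confusion matrix and from predicted scores. It is not the study's code: the function names `confusion_metrics`, `f1_from_precision_recall` and `pairwise_auc` are hypothetical, and the counts and scores in the example are invented for illustration rather than taken from the study data.

```python
import math


def confusion_metrics(tp, fp, fn, tn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n
    recall = tp / (tp + fn)        # sensitivity
    precision = tp / (tp + fp)     # positive predictive value
    f1 = 2 * precision * recall / (precision + recall)
    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2
    kappa = (accuracy - p_expected) / (1 - p_expected)
    # Matthews correlation coefficient.
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "accuracy": accuracy,
        "recall": recall,
        "precision": precision,
        "f1": f1,
        "kappa": kappa,
        "mcc": mcc,
    }


def f1_from_precision_recall(precision, recall):
    """F1 score as the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def pairwise_auc(pos_scores, neg_scores):
    """AUC as the probability that a random positive case outscores a
    random negative case (ties count as one half)."""
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos_scores
        for q in neg_scores
    )
    return wins / (len(pos_scores) * len(neg_scores))


if __name__ == "__main__":
    # Illustrative counts only; NOT the study's confusion matrix.
    for name, value in confusion_metrics(tp=40, fp=10, fn=20, tn=130).items():
        print(f"{name}: {value:.3f}")
    # Illustrative scores only.
    print("auc:", round(pairwise_auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2, 0.1]), 3))
```

As a consistency check on the reported figures, `f1_from_precision_recall(0.475, 0.441)` gives approximately 0.457, which agrees with the reported F1 score of 0.458 to within rounding of the inputs.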