Abstract

The prediction of fracture risk in osteoporotic patients has been a topic of interest for decades, and models have been developed for the accurate prediction of fracture, including the fracture risk assessment tool (FRAX). As machine-learning methodologies have recently emerged as a potential basis for medical prediction tools, we aimed to develop a novel fracture prediction model using machine-learning methods in a prospective community-based cohort. In this study, 2227 participants (1257 females) with baseline bone mineral density (BMD) and trabecular bone score measurements were enrolled from the Ansung cohort. The primary endpoint was fragility fracture, either reported by patients or confirmed by X-ray. We used 3 different models: CatBoost, support vector machine (SVM), and logistic regression. During a mean 7.5-year follow-up (range, 2.5 to 10 years), fragility fractures occurred in 537 participants (25.6%). In predicting total fragility fractures, the area under the curve (AUC) values of the CatBoost, SVM, and logistic regression models were 0.688, 0.500, and 0.614, respectively. The AUC value of CatBoost was significantly better than that of FRAX (0.663; p < 0.001), whereas those of the SVM and logistic regression models were not. Compared with conventional models such as SVM and logistic regression, the CatBoost model had the best performance in predicting total fragility fractures (p < 0.001). According to feature importance in the CatBoost model, the top predicting factors (listed in order) were total hip, lumbar spine, and femur neck BMD, subjective arthralgia score, serum creatinine, and homocysteine. The latter three factors were ranked higher than conventional predictors such as age or previous fracture history. In summary, we hereby report the development of a prediction model for fragility fractures using a machine-learning method, CatBoost, which outperforms the FRAX model as well as two conventional machine-learning models. The model was also able to propose novel high-ranking predictors.

© 2020 The Authors. JBMR Plus published by Wiley Periodicals, Inc. on behalf of American Society for Bone and Mineral Research.
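As a reading aid for the AUC values quoted in the abstract, the sketch below computes AUC as the probability that a randomly chosen participant with a fracture receives a higher predicted risk than a randomly chosen participant without one, counting ties as one half. Under this definition, 0.500 is the value obtained by a model with no discriminative ability. The function name `roc_auc` and all data are hypothetical illustrations; this is not the authors' evaluation code.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = fragility fracture during follow-up, 0 = none.
fractured = [1, 0, 1, 0, 0, 1]
risk_scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.8]   # an informative model
constant_scores = [0.5] * len(fractured)        # a model that cannot rank anyone

auc_informative = roc_auc(fractured, risk_scores)
auc_constant = roc_auc(fractured, constant_scores)
print(auc_informative, auc_constant)
```

The pairwise loop is quadratic in the number of participants, which is acceptable for illustration; a rank-based formula gives the same value more efficiently on large cohorts.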

Highlights

  • Fragility fracture has become a major socioeconomic issue in an aging society

  • Machine-learning methodologies have emerged in medical prediction models, especially in cardiovascular disease.[5,6] In a similar way, this new approach might improve the performance of current fracture prediction models by including all possible variables such as the bone mineral density (BMD) of all sites as well as trabecular bone score (TBS) data

  • We aimed to develop a prediction model of fragility fractures and discover novel risk factors using a machine-learning method in a large longitudinal community-based cohort study


Introduction

Fragility fracture has become a major socioeconomic issue in an aging society. The incidence of osteoporosis has been reported to be 12.9% in men and 24.0% in women over 50 years of age, and the frequency of osteoporotic fractures is continuously increasing by an annual average of 15.2% in Korea.[1] Fragility fractures and their socioeconomic costs increase along with the incidence of osteoporosis,[1] which makes fracture prediction and prevention particularly important.

Machine-learning methodologies have emerged in medical prediction models, especially in cardiovascular disease.[5,6] In a similar way, this new approach might improve the performance of current fracture prediction models by including all possible variables, such as the BMD of all sites as well as trabecular bone score (TBS) data. Although there are a few studies on osteoporosis and fracture prediction using machine learning,[7–9] a machine-learning fracture prediction model based on a large longitudinal cohort study including BMD and TBS has not been developed. There are various machine-learning techniques, such as support vector machine (SVM) and gradient boosting models like XGBoost and CatBoost (for "categorical boosting"). Gradient boosting is a powerful machine-learning technique, typically built on decision trees, that can be applied without the extensive data training that some other machine-learning techniques require. Among the gradient boosting techniques, CatBoost is the most recently developed model with excellent performance; it can handle categorical features without preprocessing, which lowers the chance of overfitting and yields more generalized models.[10]
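To make the idea of gradient boosting on decision trees more concrete, here is a minimal sketch of the generic technique: each round fits a one-split tree (a stump) to the current residuals and adds a shrunken copy of it to the ensemble. It assumes squared-error loss and a single numeric feature, and it is not CatBoost itself nor the study's pipeline. All function names, parameter values, and data below are hypothetical.

```python
def fit_stump(xs, residuals):
    """Return (threshold, left_mean, right_mean) minimising squared error."""
    best = None
    # Candidate thresholds: every distinct value except the largest,
    # so that both sides of the split are non-empty.
    for t in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lm = sum(left) / len(left)
        rm = sum(right) / len(right)
        err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    if best is None:
        raise ValueError("feature needs at least two distinct values")
    return best[1], best[2], best[3]


def stump_predict(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def fit_gradient_boosting(xs, ys, n_rounds=50, learning_rate=0.1):
    """Boost stumps on squared error; returns (base, stumps, learning_rate)."""
    base = sum(ys) / len(ys)          # initial prediction: the mean outcome
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        # For squared error, the negative gradient is simply the residual.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump_predict(stump, x)
                 for p, x in zip(preds, xs)]
    return base, stumps, learning_rate


def predict(model, x):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(stump_predict(s, x) for s in stumps)


def mean_squared_error(model, xs, ys):
    return sum((y - predict(model, x)) ** 2 for x, y in zip(xs, ys)) / len(ys)


# Hypothetical toy data: a BMD-like feature and a 0/1 fracture outcome.
bmd = [0.6, 0.7, 0.8, 0.9, 1.0, 1.1]
fracture = [1, 1, 1, 0, 0, 0]

baseline = fit_gradient_boosting(bmd, fracture, n_rounds=0)   # mean only
boosted = fit_gradient_boosting(bmd, fracture, n_rounds=50)
print(mean_squared_error(baseline, bmd, fracture),
      mean_squared_error(boosted, bmd, fracture))
```

The small learning rate means each stump corrects only part of the remaining error, which is the usual way boosting trades training speed for a smoother fit. Production libraries add many refinements on top of this loop, including the categorical-feature handling mentioned above.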
