Abstract

The study aimed to utilize machine learning (ML) approaches and genomic data to develop a prediction model for bone mineral density (BMD) and identify the best modeling approach for BMD prediction. The genomic and phenotypic data of Osteoporotic Fractures in Men Study (n = 5130) was analyzed. Genetic risk score (GRS) was calculated from 1103 associated SNPs for each participant after a comprehensive genotype imputation. Data were normalized and divided into a training set (80%) and a validation set (20%) for analysis. Random forest, gradient boosting, neural network, and linear regression were used to develop BMD prediction models separately. Ten-fold cross-validation was used for hyper-parameters optimization. Mean square error and mean absolute error were used to assess model performance. When using GRS and phenotypic covariates as the predictors, all ML models’ performance and linear regression in BMD prediction were similar. However, when replacing GRS with the 1103 individual SNPs in the model, ML models performed significantly better than linear regression (with lasso regularization), and the gradient boosting model performed the best. Our study suggested that ML models, especially gradient boosting, can improve BMD prediction in genomic data.

Highlights

  • The study aimed to utilize machine learning (ML) approaches and genomic data to develop a prediction model for bone mineral density (BMD) and identify the best modeling approach for BMD prediction

  • 12%, 12%, and 13% of the 1103 Single Nucleotide Polymorphisms (SNPs) were significantly associated with femoral neck, total hip, and total spine BMD at the significance level α = 0.05, respectively

  • This study presents findings from employing various ML models and linear regression, as well as genotype and phenotype data for BMD prediction in older men

Read more

Summary

Introduction

The study aimed to utilize machine learning (ML) approaches and genomic data to develop a prediction model for bone mineral density (BMD) and identify the best modeling approach for BMD prediction. The genomic and phenotypic data of Osteoporotic Fractures in Men Study (n = 5130) was analyzed. Our study suggested that ML models, especially gradient boosting, can improve BMD prediction in genomic data. Osteoporosis and its major complication, osteoporotic fracture, which affects both men and women, cause substantial morbidity and mortality ­worldwide[1]. Major genome-wide association studies (GWAS) and genome-wide meta-analyses have successfully identified numerous BMD-associated Single Nucleotide Polymorphisms (SNPs) associated with decreased ­BMD10. Combining these large number of highly significant SNPs, surprisingly, only explained a very small proportion of BMD ­variance[11]. Such inconsistency may be caused by limitations of the conventional regression approaches employed as these traditional methods lack the flexibility and adequacy of modeling complex interactions and regulations

Methods
Results
Conclusion
Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call