Abstract

The epigenetic clock, a highly accurate age estimator based on DNA methylation (DNAm) levels, underpins the prediction of mortality and morbidity and the study of the molecular mechanisms of aging, and it is of great significance in forensics, justice, and social life. Herein, we integrated machine learning (ML) algorithms to construct a blood epigenetic clock for chronological age prediction in Southern Han Chinese (CHS). Meta-analyses of the correlation coefficient (r) across 7,084 individuals were first performed to select five genes (ELOVL2, C1orf132, TRIM59, FHL2, and KLF14) from a candidate set of nine age-associated DNAm biomarkers. DNAm profiles of the CHS cohort (240 blood samples from individuals aged 1 to 81 years) were generated by bisulfite targeted amplicon pyrosequencing (BTA-pseq) at 34 cytosine-phosphate-guanine sites (CpGs) in the five selected genes, revealing that methylation levels at different CpGs exhibit population specificity. We then established and evaluated four chronological age prediction models using distinct ML algorithms: stepwise regression (SR), two variants of support vector regression (SVR-eps and SVR-nu), and random forest regression (RFR). The median absolute deviation (MAD) values were 2.97, 2.22, 2.19, and 1.29 years for SR, SVR-eps, SVR-nu, and RFR in the CHS cohort, respectively. MAD values increased with chronological age, especially in the 61–81 age category. No apparent gender effect was found in any ML model of the CHS cohort (all p > 0.05). Finally, an optimized RFR model (ntree = 500 and mtry = 8) achieved an MAD of 1.15 years in the 1–60 age categories of the CHS cohort, compared with an MAD range of 2.53–5.07 years in the meta cohort, and could be regarded as a robust blood epigenetic clock for age-related issues.
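As a rough illustration of the evaluation metric described above, the following Python sketch computes the MAD between predicted and chronological age, overall and per age category. It assumes MAD means the median of the absolute prediction errors, as the abstract defines it. The function names, the sample pairs, and every age bin other than 61–81 are hypothetical and chosen only for illustration; none of the values are taken from the study.

```python
from statistics import median


def mad(pairs):
    """Median of |predicted - chronological| over (age, predicted) pairs."""
    return median(abs(pred - age) for age, pred in pairs)


def mad_by_age_bin(pairs, bins):
    """Compute MAD separately for each inclusive (low, high) age bin."""
    result = {}
    for low, high in bins:
        in_bin = [(age, pred) for age, pred in pairs if low <= age <= high]
        if in_bin:  # skip bins with no samples
            result[(low, high)] = mad(in_bin)
    return result


# Hypothetical (chronological age, predicted age) pairs, invented for illustration.
samples = [
    (5, 5.5), (18, 17.0), (33, 34.5), (47, 45.0),
    (58, 60.0), (66, 62.0), (72, 77.0), (80, 74.0),
]

# Assumed age categories; only the 61-81 category is named in the abstract.
bins = [(1, 20), (21, 40), (41, 60), (61, 81)]

overall = mad(samples)
per_bin = mad_by_age_bin(samples, bins)

print("overall MAD:", overall)
for (low, high), value in per_bin.items():
    print(f"MAD for ages {low}-{high}: {value}")
```

Reporting the metric per age category, rather than only for the whole cohort, is what makes a trend such as larger errors in older individuals visible.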

Highlights

  • Aging is an inevitable, universal and natural phenomenon that occurs with age, characterized by a progressive decline in organismal function and increased susceptibility to irreversible degenerative disease and even death (Sen et al, 2016)

  • Garali et al compared six different statistical models with the multiple linear regression (MLR) model of Zbieć-Piekarska (Zbieć-Piekarska et al, 2015b), and the results suggested that the multiple quadratic regression (MQR), support vector machine (SVM), gradient boosting regressor (GBR), and MissMDA models outperformed the MLR model for age prediction from ELOVL2 (Garali et al, 2020)

  • We found that the age prediction accuracy decreases with chronological age in different machine learning (ML) models (Figures 4C–F)

Introduction

Aging is an inevitable, universal and natural phenomenon that occurs with age, characterized by a progressive decline in organismal function and increased susceptibility to irreversible degenerative disease and even death (Sen et al, 2016). Epigenetics is often defined as changes in gene function that do not involve any changes in DNA sequence, and epigenetic changes during aging mainly include histone modification and DNA methylation (DNAm) (Parson, 2018; Unnikrishnan et al, 2019). An initial study of age-associated methylation in normal tissue was motivated by the study of methylation in cancer (Esteller, 2002). Christensen et al supported this, proposing that variations in age- and exposure-related methylation may significantly contribute to increased susceptibility to several diseases (Christensen et al, 2009). Emerging studies are beginning to examine the associations between methylation profiles and human tissues; most of them have focused on therapeutic targets for pathological tissues (Suzuki et al, 2006; Portela and Esteller, 2010; Gao et al, 2019).
