Abstract

Lower grade gliomas (LGG, World Health Organization Grade II and III) are characterized by variable clinical behavior and outcomes. While previously identified clinical and molecular stratification factors add prognostic insight over histologic classification alone, significant heterogeneity in overall outcome remains among patients. The purpose of this study was to evaluate nine machine learning algorithms on combined clinical, mutational and methylation data from 293 adult LGGs.

A total of 293 clinically annotated LGG samples with genome-wide methylation data available in The Cancer Genome Atlas (TCGA) were included in the study. Clinically relevant information included age at diagnosis, sex, IDH, TP53 and ATRX mutations, 1p/19q status, tumor grade, histology, disease status and overall survival status. Methylation data for the 450 genes with the highest variance across all samples were included. We ran nine machine learning algorithms in RapidMiner and evaluated the accuracy of each in predicting overall survival status (deceased/living). These included tree-based and neural network-based algorithms, a support vector machine, and linear regression.

Gradient boosted trees achieved the greatest predictive accuracy (86.76%, living and deceased class recall of 84.75% and 91.67%, respectively), followed by Decision Tree (84.52%, class recall of 85.25% and 82.61%, respectively). The feature with the largest weight in each model was disease-free status.
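The variance filter mentioned above (keeping the genes whose methylation values vary most across samples) can be illustrated with a minimal sketch. The study itself used RapidMiner, so this is not the authors' code; the function name `top_variance_features`, the gene names and the toy values are all hypothetical, and the use of population variance is an assumption.

```python
from statistics import pvariance


def top_variance_features(matrix, names, k):
    """Return the k feature names with the highest variance across samples.

    matrix: list of samples, each a list of per-feature values.
    names:  feature names, one per column of the matrix.
    """
    if not matrix:
        return []
    columns = list(zip(*matrix))  # transpose: one tuple per feature
    variances = [pvariance(col) for col in columns]
    # Rank column indices by descending variance (stable sort keeps input order on ties).
    ranked = sorted(range(len(names)), key=lambda j: -variances[j])
    return [names[j] for j in ranked[:k]]


# Toy data (made up): 4 samples x 3 hypothetical genes.
toy = [
    [0.10, 0.50, 0.90],
    [0.12, 0.10, 0.80],
    [0.11, 0.90, 0.95],
    [0.09, 0.30, 0.85],
]
selected = top_variance_features(toy, ["GENE_A", "GENE_B", "GENE_C"], k=2)
print(selected)
```

In the study the same idea would apply with k = 450 over the genome-wide methylation matrix; a nearly constant gene such as the hypothetical `GENE_A` here carries little information for separating patients and is dropped.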
Combined clinical and methylation data achieved greater sensitivity and specificity than either methylation or clinical data alone. Analysis based on methylation data without clinical data achieved a maximum accuracy of 79.56% with Deep Learning (living and deceased class recall of 95% and 39.13%, respectively), while analysis based on clinical data only reached its highest accuracy of 89.16% with Random Forests (living and deceased class recall of 98.36% and 63.64%, respectively).

The combination of clinical and methylation data predicts overall survival outcome in LGG patients with high accuracy (up to 86.76% with gradient boosted trees) and the highest overall class recall. The class recall (true positive rate, or sensitivity) for deceased patients with combined data is superior to that of both the clinical data-only and the methylation data-only analyses.
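Because the comparison above rests on per-class recall rather than on accuracy alone, a small worked example may help. The sketch below uses made-up labels (not the study's data) and hypothetical helper names `class_recall` and `accuracy`; it only illustrates how a model can score well on accuracy while missing many deceased patients when the classes are imbalanced.

```python
def class_recall(y_true, y_pred, label):
    """Fraction of samples whose true class is `label` that were predicted as `label`."""
    relevant = [p for t, p in zip(y_true, y_pred) if t == label]
    if not relevant:
        raise ValueError(f"no samples with true label {label!r}")
    return sum(p == label for p in relevant) / len(relevant)


def accuracy(y_true, y_pred):
    """Fraction of all samples that were predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up example: 8 living and 2 deceased patients.
y_true = ["living"] * 8 + ["deceased"] * 2
# The hypothetical model gets every living patient right but misses one deceased patient.
y_pred = ["living"] * 8 + ["living", "deceased"]

print("accuracy:", accuracy(y_true, y_pred))
print("living recall:", class_recall(y_true, y_pred, "living"))
print("deceased recall:", class_recall(y_true, y_pred, "deceased"))
```

With these toy labels the accuracy is 0.9 while the deceased class recall is only 0.5. The clinical data-only result reported above shows the same pattern: the highest accuracy (89.16%) alongside a deceased class recall of 63.64%.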
