Abstract

In cervical cancer, the promoter CpG islands of several tumor suppressor genes show abnormal DNA methylation (DNAm) levels, which result in transcriptional silencing. Genetic databases from tumor samples are available without the corresponding DNAm profile. To fill this gap and help decipher the link between genetic variants and DNAm, we sought a statistical model that predicts DNAm from genetic data. From the TCGA dataset, we downloaded the genes with significant somatic mutations, as reported by the Broad Institute, and the DNAm beta values at known CpG sites for 307 Cervical Squamous Cell Carcinoma and Endocervical Adenocarcinoma (CESC) samples. Patient clinical data were also downloaded, including race, age, gender, and cancer stage. We tested five mixed models containing both fixed and random effects to account for unmeasured sources of variation in DNAm, for which we used the clustering of samples based on DNAm patterns. The models tested were: classified mixed model prediction (CMMP), linear regression model (LM), a high-dimensional shrinkage estimator based on the elastic net (ENET), a combination of CMMP and ENET, and random forest prediction (RF). In addition to the significantly mutated genes, the clinical variables were included as covariates to predict DNAm. We used shared random effects to borrow strength across racial groups, which improves predictive accuracy. Models were enhanced by combining data from other cancer types to increase data heterogeneity and the number of samples. Lung Adenocarcinoma (LUAD) and Liver Hepatocellular Carcinoma (LIHC) were selected using Gini and Gap statistics for optimal clustering. Model fit was evaluated using the average mean squared prediction error (MSPE). The CMMP model fitted to the combined CESC, LIHC, and LUAD data gave the best predictions (lowest MSPE).
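As an illustration of the evaluation criterion only, the following minimal Python sketch computes an average MSPE over several held-out splits and picks the model with the lowest value. The function names, the model labels, and the toy beta values are all hypothetical; the abstract does not describe how the splits were formed or how the models were implemented.

```python
from statistics import fmean


def mspe(observed, predicted):
    """Mean squared prediction error for one held-out set of beta values."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    return fmean([(o - p) ** 2 for o, p in zip(observed, predicted)])


def average_mspe(splits):
    """Average the MSPE over several (observed, predicted) held-out splits."""
    return fmean([mspe(obs, pred) for obs, pred in splits])


# Hypothetical toy data: DNAm beta values lie between 0 and 1.
# Each model has two held-out splits of (observed, predicted) values.
toy_splits = {
    "model_a": [([0.2, 0.8, 0.5], [0.25, 0.7, 0.5]), ([0.1, 0.9], [0.1, 0.8])],
    "model_b": [([0.2, 0.8, 0.5], [0.4, 0.6, 0.3]), ([0.1, 0.9], [0.3, 0.6])],
}

scores = {name: average_mspe(s) for name, s in toy_splits.items()}
best = min(scores, key=scores.get)  # lowest average MSPE wins
```

Here `best` is simply the label with the smallest average squared error on the toy data; in practice each model would be refitted on the training portion of every split before predicting the held-out samples.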
The plot of the averaged estimated ENET coefficients indicated that the most important significantly mutated genes for predicting DNAm were CTNNB1, DMD, XIRP2, and PIK3CA. SNPs in these genes linked to cervical cancer include rs121913396, rs121913403 (CTNNB1), rs867262025, and rs121913279 (PIK3CA). In summary, we developed a mixed-effects model for accurately predicting DNAm levels in cervical cancer using genes with significant somatic mutations (SNVs and INDELs). These results also align with growing evidence that genetic variation plays a role in DNAm.

Citation Format: Jairo D. Ramos, J. Sunil Rao. Predicting DNA methylation in cervical cancer using somatic mutations in a classified mixed model prediction [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2023; Part 1 (Regular and Invited Abstracts); 2023 Apr 14-19; Orlando, FL. Philadelphia (PA): AACR; Cancer Res 2023;83(7_Suppl):Abstract nr 4288.
