Abstract

Background
DNA methylation alterations have been associated with clear cell renal cell carcinoma (ccRCC). In addition, DNA methylation has been used to characterize cell heterogeneity at the tumor level using cell deconvolution methods. Other cytosine modifications have been studied less in this field. In this study, we explore the use of a compositional method, Dirichlet multinomial regression [1], to analyze DNA methylation and DNA hydroxymethylation simultaneously in ccRCC.

Methods
A total of 243 ccRCC samples were collected from the Dartmouth Renal Tumor Biobank. DNA methylation was measured with the Infinium MethylationEPIC BeadChip, using tandem bisulfite (BS) and oxidative bisulfite (oxBS) treatments to differentiate between 5-methylcytosine (5mC) and 5-hydroxymethylcytosine (5hmC) states. Beta values were calculated for all arrayed samples. Quality control was performed with the ENmix pipeline, applying a stringent pOOBAH threshold of <0.05 so that only high-quality samples were included. The oxBS maximum likelihood estimate (oxBS-MLE) method was applied to estimate 5mC and 5hmC levels from the paired BS and oxBS datasets. To determine the inclusion threshold for 5hmC beta values in the Dirichlet analysis, the R package oxBSCut was used to mask values close to zero [2]. Missing 5hmC values and values masked below the detection limit were imputed using a compositional approach and a Tobit regression model so that the properties of the model were retained. The Dirichlet model was applied to three cytosine states jointly (unmethylated, methylated, and hydroxymethylated) to characterize the methylation landscape of the ccRCC samples.

Results
We analyzed the top 1% most variable methylated sites (n=3020). When comparing tumor versus adjacent normal samples, we tested the cytosine modifications in a single model, adjusting for sex, age at diagnosis, tumor stage, and grade.
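As an illustration of the three-state composition analyzed here, the following minimal Python sketch derives unmethylated, 5mC, and 5hmC proportions from a paired BS and oxBS beta value by simple subtraction, and then ranks sites by variability. It is not the oxBS-MLE estimator used in this study; the subtraction rule, the use of variance as the variability measure, and the function names `three_state_composition` and `top_variable_sites` are all assumptions made for illustration only.

```python
from statistics import pvariance


def three_state_composition(beta_bs, beta_oxbs):
    """Naive per-site estimate of (unmethylated, 5mC, 5hmC) proportions.

    beta_bs   : beta value after BS treatment (reflects 5mC + 5hmC)
    beta_oxbs : beta value after oxBS treatment (reflects 5mC only)
    Assumption: simple subtraction, not the oxBS-MLE used in the study.
    """
    five_mc = min(max(beta_oxbs, 0.0), 1.0)
    # A negative BS - oxBS difference is treated as noise and set to zero.
    five_hmc = max(beta_bs - beta_oxbs, 0.0)
    # Cap 5hmC so the three proportions cannot exceed a total of 1.
    five_hmc = min(five_hmc, 1.0 - five_mc)
    unmethylated = 1.0 - five_mc - five_hmc
    return unmethylated, five_mc, five_hmc


def top_variable_sites(betas_by_site, fraction=0.01):
    """Return the IDs of the most variable sites across samples.

    betas_by_site : dict mapping a site ID to its list of beta values
    fraction      : share of sites to keep (0.01 corresponds to the top 1%)
    Assumption: variability is measured by the population variance.
    """
    n_keep = max(1, round(len(betas_by_site) * fraction))
    ranked = sorted(
        betas_by_site,
        key=lambda site: pvariance(betas_by_site[site]),
        reverse=True,
    )
    return ranked[:n_keep]


if __name__ == "__main__":
    # Hypothetical values for a single CpG site in one sample.
    print(three_state_composition(beta_bs=0.8, beta_oxbs=0.6))
```

Each returned triple sums to one, which is the constraint a compositional model such as Dirichlet regression is designed to respect. Sites whose 5hmC estimate is clipped to zero correspond to the near-zero values that the Methods describe masking and imputing.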
For this exploratory analysis, we selected results with an FDR <0.01 and an absolute difference >0.2 for further exploration. Four sites were hyperhydroxymethylated: three related to MYO5A, DLG2, and ANKRD33B, and one at an open sea site. A total of 23 sites were significantly hypermethylated. Using eFORGE-TF, these sites tracked to transcription factors targeting estrogen receptors, retinoic acid receptors alpha and gamma, and FOXC1, among others [3].

Conclusions
In this preliminary analysis, using a new modeling approach to interrogate cytosine modifications simultaneously, we found results consistent with previous literature. We will expand this analysis to evaluate how extreme changes in hydroxymethylation, in particular, relate to alterations in alternative splicing.

References
1. Tsagris M, Stewart C. A Dirichlet regression model for compositional data with zeros. Lobachevskii J Math. 2018;39:398–412. https://doi.org/10.1134/S1995080218030198
2. Zhang Z, Lee MK, Perreard L, Kelsey KT, Christensen BC, Salas LA. Navigating the hydroxymethylome: experimental biases and quality control tools for the tandem bisulfite and oxidative bisulfite Illumina microarrays. Epigenomics. 2022;14(3):139–152. https://doi.org/10.2217/epi-2021-0490
3. Breeze CE, Reynolds AP, van Dongen J, Dunham I, Lazar J, Neph S, Vierstra J, Bourque G, Teschendorff AE, Stamatoyannopoulos JA, Beck S. eFORGE v2.0: updated analysis of cell type-specific signal in epigenomic data. Bioinformatics. 2019;35(22):4767–4769. https://doi.org/10.1093/bioinformatics/btz456