Abstract
Background: Epigenome-wide association studies (EWAS), which seek the association between epigenetic marks and an outcome or exposure, involve multiple hypothesis testing. False discovery rate (FDR) control has been widely used for multiple testing correction. However, traditional FDR control methods do not use auxiliary covariates, and they can lose power when such covariates carry information about the likelihood of the null hypothesis. Recently, many covariate-adaptive FDR control methods have been developed, but their application to EWAS data has not yet been explored. It is not clear whether these methods can significantly improve detection power, and if so, which covariates are more relevant for EWAS data.

Results: In this study, we evaluate the performance of five covariate-adaptive FDR control methods with EWAS-related covariates using simulated as well as real EWAS datasets. We develop an omnibus test to assess the informativeness of the covariates. We find that statistical covariates are generally more informative than biological covariates, and that the covariates of methylation mean and variance are almost universally informative. In contrast, the informativeness of biological covariates depends on the specific dataset. We show that the independent hypothesis weighting (IHW) and covariate adaptive multiple testing (CAMT) methods are overall more powerful, especially for sparse signals, and could improve the detection power by a median of 25% and 68% on real datasets, compared to the ST procedure. We further validate the findings in various biological contexts.

Conclusions: Covariate-adaptive FDR control methods with informative covariates can significantly increase the detection power for EWAS. For sparse signals, IHW and CAMT are recommended.
Highlights
Epigenome-wide association studies (EWAS), which seek the association between epigenetic marks and an outcome or exposure, involve multiple hypothesis testing
Overview of the EWAS datasets and covariates selected for evaluation: in this study, 61 EWAS datasets were collected from 58 Gene Expression Omnibus (GEO) methylation datasets generated on the Infinium Human Methylation 450K BeadChip platform
The constructed surrogate variables were included as covariates in the regression model to account for potential confounding effects
Summary
Illumina’s Infinium Human Methylation 450K BeadChip and EPIC BeadChip, which cover more than 450,000 and 850,000 CpG methylation sites respectively, are the two predominant methylation arrays on the market. The availability of these high-density arrays has fueled epigenome-wide association studies (EWAS), which seek to identify methylation variants associated with an outcome or exposure of interest [18,19,20,21,22,23]. Because an EWAS tests a very large number of sites at once, it involves multiple hypothesis testing. Two statistical approaches have been developed to address multiple testing: family-wise error rate (FWER) control and false discovery rate (FDR) control. Compared to the Benjamini-Hochberg (BH) procedure, the Storey (ST) procedure takes the estimated proportion of true null hypotheses into account and is more powerful when the signal is dense.
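To make the BH versus ST comparison concrete, the following is a minimal sketch in plain Python, not the implementation used in the study. It assumes a simplified Storey-style estimator of the null proportion with a fixed tuning value `lam = 0.5`; the function names and the toy p-values are hypothetical and chosen only for illustration.

```python
from typing import List


def bh_adjust(pvals: List[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values (step-up procedure)."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def estimate_pi0(pvals: List[float], lam: float = 0.5) -> float:
    """Simplified fixed-lambda estimate of the proportion of true nulls.

    Assumption: p-values above `lam` come mostly from null hypotheses,
    which are uniform on [0, 1]. The estimate is capped at 1.
    """
    m = len(pvals)
    above = sum(1 for p in pvals if p > lam)
    return min(1.0, above / ((1.0 - lam) * m))


def storey_qvalues(pvals: List[float], lam: float = 0.5) -> List[float]:
    """Storey-style q-values: BH-adjusted p-values scaled by the pi0 estimate."""
    pi0 = estimate_pi0(pvals, lam)
    return [pi0 * a for a in bh_adjust(pvals)]


# Toy p-values with a dense signal (hypothetical, for illustration only).
toy_pvals = [0.001, 0.005, 0.01, 0.015, 0.02, 0.025, 0.03, 0.045, 0.7, 0.9]
alpha = 0.05

bh = bh_adjust(toy_pvals)
qv = storey_qvalues(toy_pvals)
n_bh = sum(1 for a in bh if a <= alpha)
n_st = sum(1 for q in qv if q <= alpha)
print("estimated pi0:", estimate_pi0(toy_pvals))
print("BH rejections:", n_bh)
print("ST rejections:", n_st)
```

Because the estimated null proportion is capped at 1, the q-values in this sketch are never larger than the BH-adjusted p-values, so the ST-style rule rejects at least as many hypotheses as BH. The gap grows as the signal becomes denser and the estimated null proportion drops. Neither rule uses covariates, which is the limitation the covariate-adaptive methods aim to address.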