Abstract

An epigenome-wide association study (EWAS) is a large-scale study of human disease-associated epigenetic variation, specifically variation in DNA methylation. High-throughput technologies enable simultaneous profiling of DNA methylation at hundreds of thousands of CpGs across the genome. The clustering of correlated DNA methylation at CpGs is reportedly similar to linkage disequilibrium (LD) among genetic single nucleotide polymorphisms (SNPs). However, current analysis methods that test one CpG at a time, such as the t-test and rank-sum test, may be underpowered to detect differentially methylated markers. We propose to test the association between the outcome (e.g., case or control) and a set of CpG sites jointly. Here, we compared the performance of five CpG set analysis approaches, namely principal component analysis (PCA), supervised principal component analysis (SPCA), kernel principal component analysis (KPCA), sequence kernel association test (SKAT), and sliced inverse regression (SIR), with Hotelling’s T² test and the t-test with Bonferroni correction. The simulation results revealed that the first six methods can control the type I error at the significance level, while the t-test is conservative. SPCA and SKAT performed better than the other approaches when the correlation among CpG sites was strong. For illustration, these methods were also applied to a real methylation dataset.
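The idea of testing a CpG set jointly can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a simplified PCA-style set test: samples are projected onto the first principal component of the CpG set, and the case-control difference in that score is assessed by permutation. The function names, the Gaussian simulation with exchangeable correlation `rho`, and the mean `shift` for cases are all hypothetical choices made for illustration.

```python
import numpy as np


def pc1_scores(X):
    """Project samples onto the first principal component of the CpG set."""
    Xc = X - X.mean(axis=0)  # center each CpG column
    # Rows of Vt are the principal directions of the centered data.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[0]


def set_test_pvalue(X, y, n_perm=2000, seed=0):
    """Permutation p-value for a PC1 mean difference between cases and controls.

    Illustrative assumption: the paper's PCA-based test may use a different
    statistic and more than one component.
    """
    rng = np.random.default_rng(seed)
    scores = pc1_scores(X)  # unsupervised, so computed once
    y = np.asarray(y, dtype=bool)

    def stat(labels):
        return abs(scores[labels].mean() - scores[~labels].mean())

    observed = stat(y)
    perm = np.array([stat(rng.permutation(y)) for _ in range(n_perm)])
    # The add-one correction keeps the p-value strictly positive.
    return (1 + np.sum(perm >= observed)) / (n_perm + 1)


def simulate_cpg_set(n_per_group=100, n_cpg=10, rho=0.6, shift=0.0, seed=1):
    """Simulate correlated methylation values (hypothetical Gaussian model).

    Every pair of CpGs has correlation `rho`; cases are shifted by `shift`.
    """
    rng = np.random.default_rng(seed)
    cov = np.full((n_cpg, n_cpg), rho) + (1 - rho) * np.eye(n_cpg)
    X = rng.multivariate_normal(np.zeros(n_cpg), cov, size=2 * n_per_group)
    y = np.repeat([False, True], n_per_group)
    X[y] += shift
    return X, y


if __name__ == "__main__":
    X, y = simulate_cpg_set(shift=0.5)
    print("set-level permutation p-value:", set_test_pvalue(X, y))
```

Because the set is summarized by a single score, one test is performed per CpG set rather than one per CpG, so no Bonferroni correction across the CpGs within the set is needed in this sketch.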

Highlights

  • DNA polymorphisms explain only a small proportion of inheritance patterns in many complex diseases [1]

  • Type I error rates of principal component analysis (PCA), supervised principal component analysis (SPCA), kernel principal component analysis (KPCA), sequence kernel association test (SKAT), sliced inverse regression (SIR), Hotelling’s T² test, and the t-test based on 10 CpGs are presented in Table 3 and Tables A-C in S1 File

  • The power of PCA, SPCA, KPCA, and SKAT increased as the correlation among CpGs increased, but there was no apparent trend for SIR, Hotelling’s T² test, or the t-test

Summary

Introduction

DNA polymorphisms explain only a small proportion of inheritance patterns in many complex diseases [1]. Some of the missing heritability might be explained by epigenetic variation, especially DNA methylation [2]. The DNA methylation state is more determinative of gene expression levels than the DNA sequence [3]. Levels of DNA methylation may “record” an individual’s environmental exposures, and methylation is a potential biomarker for disease diagnosis and risk stratification [4,5].
