Abstract

Motivated by analyses of DNA methylation data, we propose a semiparametric mixture model, namely, the generalized exponential tilt mixture model, to account for heterogeneity between differentially methylated and nondifferentially methylated subjects in the cancer group, and to capture the differences in higher order moments (e.g., mean and variance) between subjects in the cancer and normal groups. A pairwise pseudolikelihood is constructed to eliminate the unknown nuisance function. To circumvent the boundary and nonidentifiability problems that arise in parametric mixture models, we modify the pseudolikelihood by adding a penalty function. In addition, a test with a simple asymptotic distribution has computational advantages over permutation-based tests for high-dimensional genetic or epigenetic data. We propose a pseudolikelihood-based expectation–maximization test and show that the proposed test statistic follows a simple chi-squared limiting distribution. Simulation studies show that the proposed test controls Type I error well and has higher power than several existing tests. In particular, the proposed test outperforms the commonly used tests under all simulation settings considered, especially when there are variance differences between the two groups. The proposed test is applied to a real dataset to identify differentially methylated sites between ovarian cancer subjects and normal subjects. Supplementary materials for this article are available online.
