Abstract

The statistical analysis of omics data is computationally challenging because of its ultra-high dimensionality and the frequent correlation between features. In this work, we extended the Iterative Sure Independence Screening (ISIS) algorithm by pairing ISIS with elastic-net (Enet) and two versions of adaptive Enet (AEnet and MSAEnet) to efficiently improve feature selection and effect estimation in omics research. We then used genome-wide human blood DNA methylation data from American Indians in the Strong Heart Study (N=2,235 participants), measured in 1989-1991, to compare the performance (predictive accuracy, coefficient estimation and computational efficiency) of SIS-paired regularization methods with that of Bayesian shrinkage and traditional linear regression in identifying an epigenomic multi-marker of body mass index (BMI). ISIS-AEnet outperformed the other methods in prediction. In a biological pathway enrichment analysis of genes annotated to BMI-related differentially methylated positions, ISIS-AEnet captured most of the enriched pathways that were shared by at least two of the evaluated methods. ISIS-AEnet can favor biological discovery because it identifies the most robust biological pathways while achieving an optimal balance between bias and efficient feature selection. In the extended SIS R package, we also implemented ISIS paired with Cox and logistic regression for time-to-event and binary endpoints, respectively, as well as bootstrap confidence intervals for the estimated regression coefficients.
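To illustrate the general idea of pairing screening with a regularized fit, the following is a minimal Python sketch, not the SIS R package implementation. It assumes a single, non-iterative screening pass (plain SIS rather than ISIS), a screening size of n/log(n), and a basic coordinate-descent elastic net with fixed, untuned penalty values. The function names (`sis_screen`, `elastic_net`, `soft_threshold`), the parameters (`d`, `lam`, `alpha`) and the simulated data are all hypothetical and chosen only for illustration.

```python
import numpy as np

def sis_screen(X, y, d=None):
    """Keep the d features with the largest absolute marginal correlation with y."""
    n, p = X.shape
    if d is None:
        d = int(n / np.log(n))  # assumed screening size for this sketch
    d = min(d, p)
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    ys = (y - y.mean()) / y.std()
    score = np.abs(Xs.T @ ys) / n  # absolute marginal correlations
    return np.sort(np.argsort(score)[::-1][:d])

def soft_threshold(z, t):
    """Soft-thresholding operator used by the L1 part of the penalty."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def elastic_net(X, y, lam=0.1, alpha=0.5, n_iter=100):
    """Coordinate descent for
    (1/2n)||y - Xb||^2 + lam * (alpha * ||b||_1 + (1 - alpha)/2 * ||b||_2^2)
    on standardized columns; returns coefficients on the standardized scale."""
    n, p = X.shape
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    resid = y - y.mean()
    beta = np.zeros(p)
    for _ in range(n_iter):
        for j in range(p):
            resid += Xs[:, j] * beta[j]      # add back feature j's contribution
            rho = Xs[:, j] @ resid / n       # partial-residual covariance
            beta[j] = soft_threshold(rho, lam * alpha) / (1.0 + lam * (1.0 - alpha))
            resid -= Xs[:, j] * beta[j]      # subtract the updated contribution
    return beta

# Simulated example (hypothetical data): 3 true signals among 1000 features.
rng = np.random.default_rng(0)
n, p = 200, 1000
X = rng.standard_normal((n, p))
true_idx = [3, 50, 700]
y = 2.0 * X[:, 3] - 2.0 * X[:, 50] + 1.5 * X[:, 700] + 0.5 * rng.standard_normal(n)

keep = sis_screen(X, y)            # step 1: marginal screening
beta = elastic_net(X[:, keep], y)  # step 2: elastic net on the screened features
selected = keep[beta != 0]         # original column indices with nonzero coefficients
print(len(keep), "features kept after screening;", len(selected), "selected by the elastic net")
```

Screening first reduces the problem from p features to roughly n/log(n), so the penalized fit runs on a much smaller design matrix. An iterative variant would repeat the screening step on the residuals of the current fit, and an adaptive variant would reweight the L1 penalty using initial coefficient estimates; neither is shown here.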
