Eigenvalue significance testing for genetic association.

Yi-Hui Zhou, Fred A Wright, J S Marron

doi:10.1111/biom.12767

Open Access

Journal: Biometrics
Publication Date: Aug 29, 2017
Citations: 10
License type: CC BY-NC-ND 4.0

Affiliation: North Carolina State University, University North

Abstract

Genotype eigenvectors are widely used as covariates for control of spurious stratification in genetic association. Significance testing for the accompanying eigenvalues has typically been based on a standard Tracy-Widom limiting distribution for the largest eigenvalue, derived under white-noise assumptions. It is known that even modest local correlation among markers inflates the largest eigenvalues, even in the absence of true stratification. In addition, a few sample eigenvalues may be extreme, creating further complications in accurate testing. We explore several methods to identify appropriate null eigenvalue thresholds, while remaining sensitive to eigenvalues corresponding to population stratification. We introduce a novel block permutation approach, designed to produce an appropriate null eigenvalue distribution by eliminating long-range genomic correlation while preserving local correlation. We also propose a fast approach based on eigenvalue distribution modeling, using a simple fit criterion and the general Marčenko-Pastur equation under a simple discrete eigenvalue model. Block permutation and the model-based approach work well for pure simulations and for data resampled from the 1000 Genomes project. In contrast, we find that the standard approach of computing an "effective" number of markers does not perform well. The performance of the methods is also demonstrated for a motivating example from the International Cystic Fibrosis Consortium.
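
The block permutation idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify block sizes, standardization, or the test statistic, so the function names (`block_permute`, `top_eigenvalue`, `null_top_eigenvalues`), the fixed `block_size`, and the use of the largest eigenvalue of the standardized sample-by-sample matrix are all assumptions made for illustration. The sketch shuffles the sample order independently within each contiguous block of markers, which keeps correlation among markers inside a block and breaks correlation between blocks.

```python
import numpy as np


def block_permute(genotypes, block_size, rng):
    """Shuffle sample order independently within each contiguous marker block.

    genotypes: (n_samples, n_markers) array. block_size is a hypothetical
    tuning parameter; the abstract does not say how blocks are chosen.
    """
    n_samples, n_markers = genotypes.shape
    permuted = np.empty_like(genotypes)
    for start in range(0, n_markers, block_size):
        stop = min(start + block_size, n_markers)
        # One permutation per block: columns inside the block move together,
        # so their mutual (local) correlation is unchanged.
        order = rng.permutation(n_samples)
        permuted[:, start:stop] = genotypes[order, start:stop]
    return permuted


def top_eigenvalue(genotypes):
    """Largest eigenvalue of the sample-by-sample matrix of standardized markers."""
    centered = genotypes - genotypes.mean(axis=0)
    sd = centered.std(axis=0)
    keep = sd > 0  # drop monomorphic markers, which cannot be standardized
    z = centered[:, keep] / sd[keep]
    gram = z @ z.T / z.shape[1]
    # eigvalsh returns eigenvalues in ascending order for symmetric matrices.
    return np.linalg.eigvalsh(gram)[-1]


def null_top_eigenvalues(genotypes, block_size, n_perm, seed=0):
    """Empirical null sample of the top eigenvalue under block permutation."""
    rng = np.random.default_rng(seed)
    return np.array([
        top_eigenvalue(block_permute(genotypes, block_size, rng))
        for _ in range(n_perm)
    ])
```

Under these assumptions, an observed top eigenvalue could be compared with an upper quantile of the values returned by `null_top_eigenvalues`, for example `np.quantile(null, 0.95)`. The choice of block size governs how much local correlation is retained, and a real analysis would need to set it from the marker map rather than use a fixed count.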

Full Text