Abstract

Motivation
Permutation-based significance thresholds have been shown to be a robust alternative to classical Bonferroni significance thresholds in genome-wide association studies for skewed phenotype distributions. The recently published method permGWAS introduced a batch-wise approach to efficiently compute permutation-based genome-wide association studies. However, running multiple univariate tests in parallel leads to many redundant computations and increased use of computational resources. More importantly, traditional permutation methods that permute only the phenotype break the underlying population structure.

Results
We propose permGWAS2, an improved method that does not break the population structure during permutations and uses a block matrix decomposition to optimize computations, thereby reducing redundancies. We show on synthetic data that this improved approach yields a lower false discovery rate for skewed phenotype distributions than both the previous version and the commonly used Bonferroni correction. In addition, we re-analyze a dataset covering phenotypic variation in 86 traits in a population of 615 wild sunflowers (Helianthus annuus L.). This re-analysis identified dozens of novel associations with putatively adaptive traits and removed several likely false-positive associations with limited biological support.

Availability
permGWAS2 is open-source and publicly available on GitHub: https://github.com/grimmlab/permGWAS.

Supplementary information
Supplementary data are available at Bioinformatics Advances online.
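To make the notion of a permutation-based significance threshold concrete, the following is a minimal sketch of the conventional scheme that permutes only the phenotype, that is, the approach described above as breaking population structure. It is not the permGWAS2 algorithm: it uses the absolute Pearson correlation as a stand-in test statistic instead of a linear mixed model, it does not correct for population structure, and all names (`toy_data`, `pearson_abs`, `permutation_threshold`) as well as the simulated data are hypothetical.

```python
import random


def pearson_abs(x, y):
    """Absolute Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # monomorphic marker or constant phenotype: no signal
    return abs(sxy) / (sxx * syy) ** 0.5


def toy_data(n_samples=50, n_snps=30, effect=3.0, seed=0):
    """Simulate genotypes coded 0/1/2 and a skewed phenotype (assumed setup).

    Only the first SNP has a true effect; the noise is exponential,
    which gives the phenotype a skewed distribution.
    """
    rng = random.Random(seed)
    genotypes = [
        [(rng.random() < 0.5) + (rng.random() < 0.5) for _ in range(n_samples)]
        for _ in range(n_snps)
    ]
    phenotype = [effect * g + rng.expovariate(1.0) for g in genotypes[0]]
    return genotypes, phenotype


def permutation_threshold(genotypes, phenotype, n_perm=200, alpha=0.05, seed=0):
    """Empirical (1 - alpha) quantile of the per-permutation maximum statistic.

    Each permutation shuffles the phenotype only, then records the largest
    statistic over all SNPs; this controls the family-wise error rate
    without assuming a particular phenotype distribution.
    """
    rng = random.Random(seed)
    maxima = []
    for _ in range(n_perm):
        shuffled = phenotype[:]
        rng.shuffle(shuffled)
        maxima.append(max(pearson_abs(snp, shuffled) for snp in genotypes))
    maxima.sort()
    idx = min(n_perm - 1, int((1 - alpha) * n_perm))
    return maxima[idx]


if __name__ == "__main__":
    genotypes, phenotype = toy_data(seed=1)
    threshold = permutation_threshold(genotypes, phenotype, n_perm=100, seed=2)
    hits = [i for i, snp in enumerate(genotypes)
            if pearson_abs(snp, phenotype) > threshold]
    print("threshold on |r|:", round(threshold, 3), "significant SNPs:", hits)
```

Because every test in this sketch uses the same sample size, thresholding the maximum absolute correlation is equivalent to thresholding the minimum p-value. A structure-preserving variant, as proposed in the abstract, would need a different permutation scheme and a mixed-model statistic, neither of which is attempted here.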