Background and Objective: Standard population genetic theory predicts that deleterious genetic variants are likely to be rare and of fairly recent origin. Can this expectation lead to more powerful tests of association between diseases and rare genetic variation? The gene genealogy describes the relationships among haplotypes sampled from the general population. Although ancestral tree-based methods, inspired by the gene genealogy concept, have been developed for finding associations with common genetic variants, here we ask whether gene genealogies can help identify genomic regions containing multiple rare causal variants.

Methods: With data simulated under several demographic models and using known gene genealogies, we developed and compared several tree-based statistics to determine which, if any, could detect the type of clustering expected with rare causal variants and whether the genealogical tree provides additional information about disease associations.

Results and Conclusions: We found that a novel statistic based on the scaled distance between the tips of a tree performed better than the other tree-based statistics. When data were simulated with mild population growth, this statistic outperformed two standard non-tree-based methods, showing that an ancestral tree-based approach has potential for rare variant discovery.
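The abstract does not define the tip-distance statistic, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a hypothetical five-tip genealogy, measures the branch length separating each pair of case tips, scales it by twice the root age (an assumed scaling), and compares the mean among case haplotypes with every equally sized set of tips. Carriers of a recent rare variant would be expected to cluster, giving an unusually small mean distance.

```python
from itertools import combinations

# Hypothetical toy genealogy (not from the study): child -> parent.
PARENT = {"a": "x", "b": "x", "x": "z", "e": "z",
          "c": "y", "d": "y", "z": "r", "y": "r"}
# Node ages in arbitrary units; tips are sampled at age 0 and "r" is the root.
AGE = {"a": 0.0, "b": 0.0, "c": 0.0, "d": 0.0, "e": 0.0,
       "x": 1.0, "y": 2.0, "z": 3.0, "r": 5.0}
TIPS = ["a", "b", "c", "d", "e"]
ROOT = "r"


def lineage(node):
    """Return the path from a node up to the root, inclusive."""
    path = [node]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def mrca(u, v):
    """Return the most recent common ancestor of two nodes."""
    seen = set(lineage(u))
    return next(n for n in lineage(v) if n in seen)


def scaled_distance(u, v):
    """Branch length between two tips divided by its maximum, 2 * root age.

    The scaling is an assumption made for this illustration.
    """
    m = mrca(u, v)
    return ((AGE[m] - AGE[u]) + (AGE[m] - AGE[v])) / (2.0 * AGE[ROOT])


def mean_scaled_distance(tips):
    """Average scaled distance over all pairs in a set of tips."""
    pairs = list(combinations(tips, 2))
    return sum(scaled_distance(u, v) for u, v in pairs) / len(pairs)


def exact_p_value(case_tips):
    """Fraction of equally sized tip sets at least as clustered as the cases."""
    observed = mean_scaled_distance(case_tips)
    null = [mean_scaled_distance(s) for s in combinations(TIPS, len(case_tips))]
    return sum(s <= observed + 1e-12 for s in null) / len(null)


cases = ["a", "b", "e"]  # hypothetical case haplotypes
print(mean_scaled_distance(cases))
print(exact_p_value(cases))
```

Enumerating every tip set is feasible only for a toy tree; with realistic sample sizes, random permutation of case labels would be the usual substitute.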