Genomic differentiation as a tool for single nucleotide polymorphism prioritization for Genome wide association and phenotype prediction in livestock

Sajjad Toghiani, Sammy E Aggrey, Ashley Ling, Ling-Yun Chang, Romdhane Rekaya

doi:10.1016/j.livsci.2017.09.007

Abstract

Genome-wide association studies (GWAS) have been successful in detecting associations between single nucleotide polymorphisms (SNPs) and phenotypic variation and in identifying several causative mutations. However, SNPs with significant associations identified using GWAS tend to explain only a small fraction of the phenotypic variation. GWAS are affected by a lack of power due to small sample sizes, large numbers of highly correlated markers, and the moderate to small effects of most quantitative trait loci (QTLs). This situation is further complicated by the continuous increase in marker density, especially with the availability of next-generation sequencing (NGS) data. NGS generates an unprecedented number of marker variants with a complex linkage disequilibrium (LD) structure, which limits the advantage and adequacy of existing methods that internally try to prioritize (filter) SNPs (e.g. BayesB and BayesR). Consequently, it is becoming necessary either to filter SNPs before conducting the association analysis or to enlist additional sources of information. Methods that include biological prior information (e.g. BayesRC) are limited by the amount and quality of available prior information. Knowledge of genetic diversity based on evolutionary forces is beneficial for tracking loci influenced by selection. The fixation index (F_ST), a measure of allele frequency variation among subpopulations, provides a tool to reveal genomic regions under selection pressure. To evaluate its usefulness as an additional source of information, a simulation was carried out. A trait with a heritability of 0.4 was simulated, and three subpopulations were created based on the empirical phenotypic distribution (below the 5% quantile, above the 95% quantile, and between the 5% and 95% quantiles). Marker data were simulated to mimic bovine SNP panels of 600 K, 1 million, and 3 million markers.
Genetic complexity of the trait was modelled by the number of QTLs, their distribution, and the magnitude of their effects. Using different empirical cutoff values for F_ST, most QTLs were correctly detected using as few as 2.5% of the SNP markers in the panels. Furthermore, the genomic similarity calculated from the selected SNPs was very high (>0.80) for individuals with similar genetic and phenotypic values, despite their having limited to no pedigree relationship. These results indicate that filtering SNPs using F_ST could benefit GWAS by focusing on genome regions under selection pressure. High functional genomic similarity based on selected markers indicates similarity in SNP signatures, regardless of relatedness, and translates into high phenotypic correlation that could be used in decision making.
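
The filtering idea summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state which F_ST estimator or cutoff rule was used, so the sketch assumes a Nei-style estimator, F_ST = (H_T - H_S) / H_T, computed per biallelic SNP from genotypes coded 0/1/2, and it assumes the cutoff is applied by keeping the top-ranked fraction of SNPs. All function names and parameters (fst, split_by_quantiles, select_snps, keep_fraction) are hypothetical.

```python
from typing import List, Sequence, Tuple


def allele_freq(genotypes: Sequence[int]) -> float:
    """Frequency of the counted allele, from genotypes coded 0/1/2."""
    return sum(genotypes) / (2 * len(genotypes))


def fst(subpops: Sequence[Sequence[int]]) -> float:
    """Nei-style F_ST = (H_T - H_S) / H_T for one biallelic SNP (assumed estimator)."""
    subpops = [g for g in subpops if g]  # ignore empty subpopulations
    n_total = sum(len(g) for g in subpops)
    if n_total == 0:
        return 0.0
    weights = [len(g) / n_total for g in subpops]
    freqs = [allele_freq(g) for g in subpops]
    p_bar = sum(w * p for w, p in zip(weights, freqs))
    h_t = 2 * p_bar * (1 - p_bar)  # expected heterozygosity, pooled population
    if h_t == 0:
        return 0.0  # monomorphic SNP carries no differentiation signal
    # size-weighted mean expected heterozygosity within subpopulations
    h_s = sum(w * 2 * p * (1 - p) for w, p in zip(weights, freqs))
    return (h_t - h_s) / h_t


def split_by_quantiles(
    phenotypes: Sequence[float], lower: float = 0.05, upper: float = 0.95
) -> Tuple[List[int], List[int], List[int]]:
    """Animal indices for the low tail, the middle, and the high tail."""
    order = sorted(range(len(phenotypes)), key=lambda i: phenotypes[i])
    lo_cut = round(len(order) * lower)
    hi_cut = round(len(order) * upper)
    return order[:lo_cut], order[lo_cut:hi_cut], order[hi_cut:]


def select_snps(
    genotype_matrix: Sequence[Sequence[int]],
    phenotypes: Sequence[float],
    keep_fraction: float = 0.025,
    lower: float = 0.05,
    upper: float = 0.95,
) -> Tuple[List[int], List[float]]:
    """Rank SNPs by F_ST across the three groups and keep the top fraction.

    genotype_matrix[i][j] is the genotype of animal i at SNP j.
    Returns (sorted indices of the kept SNPs, F_ST score of every SNP).
    """
    groups = split_by_quantiles(phenotypes, lower, upper)
    n_snps = len(genotype_matrix[0])
    scores = [
        fst([[genotype_matrix[i][j] for i in grp] for grp in groups])
        for j in range(n_snps)
    ]
    n_keep = max(1, round(n_snps * keep_fraction))
    ranked = sorted(range(n_snps), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:n_keep]), scores
```

Calling select_snps(genotype_matrix, phenotypes) with the default arguments would keep 2.5% of the SNPs, matching the smallest fraction mentioned above. The pure-Python loops are chosen for readability only; panels of the sizes discussed here would need a vectorized implementation.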

Full Text