Haplotype Analysis of Genomic Prediction Using Structural and Functional Genomic Information for Seven Human Phenotypes.

Zuoxiang Liang,Li Ma,Dzianis Prakapenka,Yang Da,Cheng Tan

doi:10.3389/fgene.2020.588907

Abstract

Genomic prediction using multi-allelic haplotype models improved the prediction accuracy for all seven human phenotypes: normality-transformed high density lipoproteins, low density lipoproteins, total cholesterol, triglycerides, and weight, and the original height and body mass index without normality transformation. Eight SNP sets with 40,941-380,705 SNPs were evaluated. The increase in prediction accuracy due to haplotypes was 1.86-8.12%. Haplotypes using fixed chromosome distances had the best prediction accuracy for four phenotypes, haplotypes using a fixed number of SNPs for two phenotypes, and gene-based haplotypes for high density lipoproteins and height (tied for best). Haplotypes of coding genes were more accurate than haplotypes of all autosome genes (which included both coding and noncoding genes) for triglycerides and weight, and nearly as accurate as haplotypes of all autosome genes for the other phenotypes. Haplotypes of noncoding genes (mostly lncRNAs) improved the prediction accuracy over the SNP models only for high density lipoproteins, total cholesterol, and height. ChIP-seq haplotypes had better prediction accuracy than gene-based haplotypes for total cholesterol, body mass index, and low density lipoproteins. The advantage of ChIP-seq haplotypes was most striking for low density lipoproteins, where all four haplotype models with ChIP-seq haplotypes had similarly high prediction accuracy, exceeding that of the best prediction model with gene-based haplotypes. Haplotype epistasis was shown to be the reason for the increased accuracy due to haplotypes. Low density lipoproteins had the largest haplotype epistasis heritability, which explained 14.70% of the phenotypic variance and was 31.27% of the SNP additive heritability, and the largest increase in prediction accuracy relative to the best SNP model (8.12%).
Relative to the SNP additive heritability of the same regions, noncoding genes had the highest haplotype epistasis heritability, followed by coding genes and ChIP-seq for the seven phenotypes. SNP and haplotype heritability profiles showed that the integration of SNP and haplotype additive values compensated for the weakness of haplotypes in estimating SNP heritabilities for four phenotypes, whereas models with haplotype additive values fully accounted for SNP additive values for three phenotypes. These results showed that haplotype analysis can be a method for utilizing functional and structural genomic information to improve the accuracy of genomic prediction.

Highlights

Genomic selection using genome-wide single nucleotide polymorphism (SNP) markers has been widely used in livestock and crop species (Meuwissen et al., 2016; Crossa et al., 2017), and genomic prediction has been applied to the prediction of human phenotypes (Maier et al., 2018; Lello et al., 2019).
The summary below focuses on the results of the best prediction models using the 380K SNP set with minor allele frequencies (MAF) of 0.05 and 320K SNP set with MAF of 0.10, and the complete results for each haplotype blocking method are shown in Supplementary Figures 4–7.
Results in this study showed haplotypes using structural and functional genomic information improved the accuracy of genomic prediction.

Summary

Introduction

Genomic selection using genome-wide single nucleotide polymorphism (SNP) markers has been widely used in livestock and crop species (Meuwissen et al., 2016; Crossa et al., 2017), and genomic prediction has been applied to the prediction of human phenotypes (Maier et al., 2018; Lello et al., 2019). Methods used in haplotype studies to define haplotype blocks for genomic prediction include a fixed number of SNPs per haplotype block (Calus et al., 2008; Villumsen et al., 2009; Jiang et al., 2018; Sallam et al., 2020; Won et al., 2020), fixed block length (Hess et al., 2017; Won et al., 2020), or linkage disequilibrium (LD) blocks (Boichard et al., 2012; Cuyabano et al., 2015; Jónás et al., 2017; Jan et al., 2019; Won et al., 2020). These haplotype studies had mixed results, ranging from decreases to substantial increases in prediction accuracy due to haplotypes relative to SNP models, but the reasons for the successes and failures of haplotype genomic prediction were unknown.
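
To make two of the blocking methods concrete, the following is a minimal Python sketch, not the implementation used in this study or in the cited studies. It assumes phased 0/1 SNP alleles on a single chromosome with positions sorted in base pairs. The function names (blocks_fixed_snp_count, blocks_fixed_distance, haplotype_alleles) and the toy data are hypothetical and chosen only for illustration. LD-based blocking is omitted because it requires pairwise LD estimates.

```python
from collections import defaultdict


def blocks_fixed_snp_count(positions, snps_per_block):
    """Group consecutive SNP indices into blocks of a fixed number of SNPs.

    The last block may be shorter if the SNP count is not a multiple
    of snps_per_block.
    """
    n = len(positions)
    return [list(range(start, min(start + snps_per_block, n)))
            for start in range(0, n, snps_per_block)]


def blocks_fixed_distance(positions, block_length_bp):
    """Group SNP indices into windows of fixed chromosome length.

    Windows containing no SNP are skipped.
    """
    windows = defaultdict(list)
    for idx, pos in enumerate(positions):
        windows[pos // block_length_bp].append(idx)
    return [windows[key] for key in sorted(windows)]


def haplotype_alleles(haplotypes, block):
    """Code each distinct SNP allele string in a block as one haplotype allele.

    Returns one integer code per phased haplotype; codes are assigned in
    order of first appearance, so a block can have more than two alleles.
    """
    codes = {}
    result = []
    for hap in haplotypes:
        key = tuple(hap[i] for i in block)
        result.append(codes.setdefault(key, len(codes)))
    return result


# Toy data (hypothetical): six SNPs, four phased haplotypes.
positions = [100, 250, 900, 1200, 1300, 2600]
haplotypes = [
    [0, 1, 0, 1, 1, 0],
    [0, 1, 1, 0, 1, 0],
    [1, 0, 0, 1, 1, 1],
    [0, 1, 0, 0, 0, 0],
]

by_count = blocks_fixed_snp_count(positions, 2)
by_distance = blocks_fixed_distance(positions, 1000)
first_block_codes = haplotype_alleles(haplotypes, by_distance[0])
```

In this toy example the first 1,000 bp window contains three SNPs and yields three distinct haplotype alleles across the four haplotypes, which illustrates why haplotype blocks are treated as multi-allelic rather than bi-allelic markers.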

Methods

Results

Conclusion