Abstract

Genetic Analysis Workshop 18 provided a platform for evaluating genomic prediction power based on single-nucleotide polymorphisms (SNPs) from SNP array data and sequencing data. It also provided a diverse pedigree structure to explore in prediction. In this study, we attempted to combine pedigree information with SNP data to predict systolic blood pressure. Our results suggest that prediction based on pedigree information alone can be unsatisfactory, and that using additional information such as SNP genotypes improves prediction accuracy. In particular, the improvement can be significant when a few SNPs have relatively large effect sizes. We also compared prediction performance based on genome-wide association study data (i.e., common variants) and sequencing data (i.e., common variants plus low-frequency variants). In our experiments, including low-frequency variants did not improve prediction accuracy.

Highlights

  • Genomic prediction is an important problem in genetics

  • The model is y = Xβ + Gα + u + e with e ∼ N(0, σ_e² I), where y ∈ R^(n×1) is the response vector; X ∈ R^(n×d) is the matrix of covariates, including the intercept and other covariates such as age and gender; β is the vector of regression coefficients for the covariates; G ∈ R^(n×p) is the genotype matrix and α is the coefficient vector for the p single-nucleotide polymorphisms (SNPs); u is the random effect, drawn from N(0, σ_u² K); and e is the residual error with variance σ_e²

  • Penalized linear mixed model: the model is difficult to apply when d + p + 2 > n, i.e., when the number of parameters exceeds the number of samples (d is the number of covariates, p is the number of SNPs treated as fixed effects, and 2 is the number of variance components)
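
To make the d + p + 2 > n difficulty concrete, the following is a minimal sketch, assuming the model takes the usual form y = Xβ + Gα + u + e. It is not the authors' implementation. A ridge (L2) penalty on α is used purely for illustration, since the text here does not specify the penalty. The variance components σ_u² and σ_e² are treated as known, K is a made-up block-diagonal matrix standing in for pedigree relatedness, and the names `needs_penalty` and `fit_ridge_lmm` are hypothetical.

```python
import numpy as np


def needs_penalty(n, d, p):
    """True when d + p + 2 > n, i.e. more parameters than samples."""
    return d + p + 2 > n


def fit_ridge_lmm(y, X, G, K, sigma_u2, sigma_e2, lam):
    """Generalized least squares with a ridge penalty on alpha only.

    Assumes the variance components are known (an illustrative shortcut).
    Minimizes (y - Xb - Ga)' V^{-1} (y - Xb - Ga) + lam * ||a||^2,
    where V = sigma_u2 * K + sigma_e2 * I.
    """
    n, d = X.shape
    p = G.shape[1]
    V = sigma_u2 * K + sigma_e2 * np.eye(n)
    W = np.hstack([X, G])                      # all fixed-effect columns
    Vinv_W = np.linalg.solve(V, W)
    Vinv_y = np.linalg.solve(V, y)
    # Penalize the SNP coefficients but not the covariate coefficients.
    penalty = np.diag(np.r_[np.zeros(d), np.full(p, lam)])
    coef = np.linalg.solve(W.T @ Vinv_W + penalty, W.T @ Vinv_y)
    return coef[:d], coef[d:]


# Simulated data (all values below are made up for illustration).
rng = np.random.default_rng(0)
n, d, p = 40, 3, 200
# Hypothetical relatedness: 10 families of 4, within-family value 0.5.
K = np.kron(np.eye(n // 4), 0.5 * np.eye(4) + 0.5 * np.ones((4, 4)))
X = np.column_stack([np.ones(n), rng.normal(size=(n, d - 1))])
G = rng.integers(0, 3, size=(n, p)).astype(float)   # 0/1/2 allele counts
alpha = np.zeros(p)
alpha[:3] = 0.5                                      # a few larger effects
u = rng.multivariate_normal(np.zeros(n), 0.5 * K)
e = rng.normal(scale=1.0, size=n)
y = X @ np.array([1.0, 0.3, -0.2]) + G @ alpha + u + e

print("penalty needed:", needs_penalty(n, d, p))
beta_hat, alpha_hat = fit_ridge_lmm(y, X, G, K, 0.5, 1.0, lam=5.0)
print("beta_hat:", beta_hat)
```

Without the penalty term, the matrix being inverted would be singular whenever d + p > n; with λ > 0 and a full-column-rank X it is invertible, which is the reason for penalizing α in this sketch.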


Summary

Introduction

Genomic prediction is an important problem in genetics. It aims to predict a phenotype from genetic markers, population and pedigree structure, and other relevant covariates. Estimating small marker effects accurately requires a larger sample size, which in turn improves prediction accuracy. Low-frequency variants (minor allele frequency [MAF] ≤5%) have not been directly observed in genome-wide association studies (GWAS). The contribution of these low-frequency variants has therefore not been taken into account in predictive models, which may result in a loss of prediction accuracy.
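
As a small illustration of the MAF ≤5% cutoff mentioned above, the sketch below computes the minor allele frequency for SNPs coded as 0/1/2 allele counts and splits them into common and low-frequency sets. The genotype coding, the SNP names, and the function names `minor_allele_frequency` and `split_by_maf` are assumptions made for this example, not details taken from the study.

```python
def minor_allele_frequency(genotypes):
    """MAF for one SNP given 0/1/2 allele counts (None marks missing)."""
    observed = [g for g in genotypes if g is not None]
    if not observed:
        raise ValueError("no observed genotypes")
    # Each individual carries two alleles, hence the factor of 2.
    freq = sum(observed) / (2 * len(observed))
    return min(freq, 1 - freq)


def split_by_maf(snps, threshold=0.05):
    """Return (common, low_frequency) SNP names; MAF <= threshold is low.

    Note: monomorphic SNPs (MAF = 0) also land in the low-frequency list.
    """
    common, low = [], []
    for name, genotypes in snps.items():
        if minor_allele_frequency(genotypes) <= threshold:
            low.append(name)
        else:
            common.append(name)
    return common, low


# Made-up genotypes for two hypothetical SNPs.
snps = {
    "snp_a": [0, 1, 2, 1, 0, 1, 1, 0, 2, 1],   # 9 of 20 alleles
    "snp_b": [0] * 19 + [1],                   # 1 of 40 alleles
}
common, low = split_by_maf(snps)
print("common:", common)
print("low-frequency:", low)
```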

