On the use of whole-genome sequence data for across-breed genomic prediction and fine-scale mapping of QTL

Theo Meuwissen, Irene Van Den Berg, Mike Goddard

doi:10.1186/s12711-021-00607-4

Open Access

https://doi.org/10.1186/s12711-021-00607-4

Abstract

Background

Whole-genome sequence (WGS) data are increasingly available on large numbers of individuals in animal and plant breeding and in human genetics through second-generation resequencing technologies, 1000 genomes projects, and large-scale genotype imputation from lower marker densities. Here, we present a computationally fast implementation of a variable selection genomic prediction method that can handle WGS data on more than 35,000 individuals, test its accuracy for across-breed predictions, and assess its quantitative trait locus (QTL) mapping precision.

Methods

The Monte Carlo Markov chain (MCMC) variable selection model (Bayes GC) simultaneously fits a genomic best linear unbiased prediction (GBLUP) term, i.e. a polygenic effect whose correlations are described by a genomic relationship matrix (G), and a Bayes C term, i.e. a set of single nucleotide polymorphisms (SNPs) with large effects selected by the model. Computational speed is improved by a Metropolis–Hastings sampling that directs computations to the SNPs that are, a priori, most likely to be included in the model. Speed is also improved by running many relatively short MCMC chains. Memory requirements are reduced by storing the genotype matrix in binary form. The model was tested on a WGS dataset containing Holstein, Jersey and Australian Red cattle. The data contained 4,809,520 genotypes on 35,549 individuals together with their milk, fat and protein yields, and fat and protein percentage traits.

Results

The prediction accuracies of the Jersey individuals improved by 1.5% when using across-breed GBLUP compared to within-breed predictions. Using WGS instead of 600 k SNP-chip data yielded on average a 3% accuracy improvement for Australian Red cows. QTL were fine-mapped by locating the SNP with the highest posterior probability of being included in the model. Various QTL known from the literature were rediscovered, and a new SNP affecting milk production was discovered on chromosome 20 at 34.501126 Mb. Due to the high mapping precision, it was clear that many of the discovered QTL were the same across the five dairy traits.

Conclusions

Across-breed Bayes GC genomic prediction improved prediction accuracies compared to GBLUP. The combination of across-breed WGS data and Bayesian genomic prediction proved remarkably effective for the fine-mapping of QTL.
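
The abstract states that memory requirements are reduced by storing the genotype matrix in binary form, but it does not give the encoding. As a rough illustration only, the sketch below assumes a 2-bit code per genotype (0, 1 or 2 allele copies, with 3 reserved for missing), packed four to a byte with NumPy. The function names `pack_genotypes` and `unpack_genotypes` are hypothetical and are not taken from the paper.

```python
import numpy as np

def pack_genotypes(geno):
    """Pack genotype codes (0, 1, 2; 3 = missing) into bytes, four codes per byte.

    Assumed 2-bit encoding for illustration; the paper only says 'binary form'.
    Returns the packed uint8 array and the original number of genotypes.
    """
    geno = np.asarray(geno, dtype=np.uint8)
    n = geno.size
    # Pad with zeros so the length is a multiple of four.
    padded = np.zeros((n + 3) // 4 * 4, dtype=np.uint8)
    padded[:n] = geno
    quads = padded.reshape(-1, 4)
    # Place each 2-bit code at bit offsets 0, 2, 4 and 6 of one byte.
    packed = (quads[:, 0]
              | (quads[:, 1] << 2)
              | (quads[:, 2] << 4)
              | (quads[:, 3] << 6))
    return packed.astype(np.uint8), n

def unpack_genotypes(packed, n):
    """Recover the first n genotype codes from a packed uint8 array."""
    shifts = np.array([0, 2, 4, 6], dtype=np.uint8)
    codes = (packed[:, None] >> shifts) & 3
    return codes.reshape(-1)[:n]

if __name__ == "__main__":
    g = np.array([0, 1, 2, 2, 1, 0, 3])
    packed, n = pack_genotypes(g)
    print(packed.nbytes, "bytes for", n, "genotypes")
    print(unpack_genotypes(packed, n))
```

Under this assumed encoding, 4,809,520 genotypes on each of 35,549 individuals would need about 43 GB, compared with about 171 GB at one byte per genotype.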

Highlights

Whole-genome sequence (WGS) data are increasingly available on large numbers of individuals in animal and plant breeding and in human genetics through second-generation resequencing technologies, 1000 genomes projects, and large-scale genotype imputation from lower marker densities.
Accuracy of prediction declines if the target population is not closely related to the training population, because the linkage disequilibrium (LD) between markers and causal variants differs between populations.
Quantitative trait locus (QTL) mapping: Figure 1 shows the Manhattan plot of the variances of local genomic breeding value estimates (GEBV) for fat percentage, calculated in 250-kb regions across the genome. The variance is an indicator of the genetic variance contained in each region [23] and thus of whether the region contains important QTL.
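
The local GEBV variance mentioned above can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes non-overlapping 250-kb windows on a single chromosome, SNP effect estimates that are already available, and the population variance (`ddof=0`). The function name `local_gebv_variance` and its arguments are hypothetical.

```python
import numpy as np

def local_gebv_variance(genotypes, effects, positions, window_bp=250_000):
    """Variance of local GEBV per window on one chromosome.

    genotypes: (n_individuals, n_snps) allele counts
    effects:   (n_snps,) estimated SNP effects (assumed given)
    positions: (n_snps,) base-pair positions
    Returns a dict mapping window start (bp) to the variance of local GEBV.
    """
    genotypes = np.asarray(genotypes, dtype=float)
    effects = np.asarray(effects, dtype=float)
    # Assign each SNP to a non-overlapping window by integer division.
    window_id = np.asarray(positions) // window_bp
    result = {}
    for w in np.unique(window_id):
        cols = window_id == w
        # Local GEBV: sum of genotype times effect over the SNPs in the window.
        local_gebv = genotypes[:, cols] @ effects[cols]
        result[int(w) * window_bp] = float(np.var(local_gebv))
    return result

if __name__ == "__main__":
    geno = [[0, 2, 1], [1, 0, 1], [2, 1, 1]]
    eff = [1.0, 0.5, 2.0]
    pos = [100, 200_000, 300_000]
    print(local_gebv_variance(geno, eff, pos))
```

Plotting these per-window variances against genome position gives a Manhattan-style plot in which windows with large variance are candidates for containing important QTL.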

Summary

Introduction

Whole-genome sequence (WGS) data are increasingly available on large numbers of individuals in animal and plant breeding, and in humans. This is due to cost-effective second-generation resequencing technologies, in combination with 1000 genomes projects (e.g. for humans [1]; plants [2]; and livestock [3]) and large-scale genotype imputation from lower marker densities. A method of genomic prediction that maintains higher accuracy when the training and target populations are not closely related is desirable. Part of such a method would exploit high-density marker or WGS data, because markers that are close to the causal variants, or the causal variants themselves, are included in the data [7]. To make effective use of such high-density markers, a method of variable selection is needed so that the causal variants, or markers in high linkage disequilibrium (LD) with them, dominate the prediction. Source: Meuwissen et al., Genet Sel Evol (2021) 53:19.
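
To make the role of variable selection concrete, a model combining a polygenic GBLUP term with a Bayes C term, as the abstract describes for Bayes GC, could be written roughly as below. The notation is an assumption based on the standard Bayes C formulation and may differ from the parameterization in the paper: $\mathbf{y}$ is the vector of records, $\mu$ the mean, $\mathbf{g}$ the polygenic effect, $\mathbf{x}_j$ the genotypes at SNP $j$, $a_j$ its effect, $\delta_j$ an indicator of whether SNP $j$ is included in the model, and $\pi$ the prior inclusion probability.

```latex
\mathbf{y} = \mathbf{1}\mu + \mathbf{g} + \sum_{j=1}^{m} \mathbf{x}_j \,\delta_j\, a_j + \mathbf{e},
\qquad
\mathbf{g} \sim N(\mathbf{0}, \mathbf{G}\sigma_g^2), \quad
a_j \sim N(0, \sigma_a^2), \quad
\delta_j \sim \mathrm{Bernoulli}(\pi), \quad
\mathbf{e} \sim N(\mathbf{0}, \mathbf{I}\sigma_e^2)
```

In this form, the posterior probability $P(\delta_j = 1 \mid \mathbf{y})$ is the quantity used to rank SNPs for fine-mapping, while $\mathbf{g}$ absorbs the many small effects that are not selected individually.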

Journal: Genetics Selection Evolution	Publication Date: Feb 26, 2021
Citations: 23	License type: open-access
