Abstract

Background: A cost-effective strategy to increase the density of available markers within a population is to sequence a small proportion of the population and impute whole-genome sequence data for the remaining population. Increased densities of typed markers are advantageous for genome-wide association studies (GWAS) and genomic predictions.

Methods: We obtained genotypes for 54 602 SNPs (single nucleotide polymorphisms) in 1077 Franches-Montagnes (FM) horses and Illumina paired-end whole-genome sequencing data for 30 FM horses and 14 Warmblood horses. After variant calling, the sequence-derived SNP genotypes (~13 million SNPs) were used for genotype imputation with the software programs Beagle, Impute2 and FImpute.

Results: The mean imputation accuracy of FM horses using Impute2 was 92.0%. Imputation accuracy using Beagle and FImpute was 74.3% and 77.2%, respectively. In addition, for Impute2 we determined the imputation accuracy of all individual horses in the validation population, which ranged from 85.7% to 99.8%. The subsequent inclusion of Warmblood sequence data further increased the correlation between true and imputed genotypes for most horses, especially for horses with a high level of admixture. The final imputation accuracy of the horses ranged from 91.2% to 99.5%.

Conclusions: Using Impute2, the imputation accuracy was higher than 91% for all horses in the validation population, which indicates that direct imputation of 50k SNP-chip data to sequence level genotypes is feasible in the FM population. The individual imputation accuracy depended mainly on the applied software and the level of admixture.

Electronic supplementary material: The online version of this article (doi:10.1186/s12711-014-0063-7) contains supplementary material, which is available to authorized users.

Highlights

  • A cost-effective strategy to increase the density of available markers within a population is to sequence a small proportion of the population and impute whole-genome sequence data for the remaining population

  • Rapid innovations in high-throughput sequencing and array technologies have drastically reduced the costs of next-generation sequencing (NGS) [1], which has made it feasible to re-sequence a large fraction of any mammalian genome

  • 50 k SNP-chips typically build the genetic resource for genomic predictions and genome-wide association studies (GWAS) in livestock and other species [2]


Summary

Introduction

A cost-effective strategy to increase the density of available markers within a population is to sequence a small proportion of the population and impute whole-genome sequence data for the remaining population. Increased densities of typed markers are advantageous for genome-wide association studies (GWAS) and genomic predictions. In livestock and other species, 50 k SNP (single nucleotide polymorphism) chips typically provide the genetic resource for genomic predictions and GWAS [2]. Genotype imputation accuracy has mainly been investigated in cattle, for imputation from low-density (3 k and 6 k) to medium-density (50 k) SNP panels and from medium-density to high-density (HD, 777 k) SNP panels. The accuracies reported in these studies were 91.2% for imputation from 3 k to 50 k [11], 99.1% from 6 k to 50 k [13] and 99.7% from 50 k to 777 k [14]. Imputation accuracies ranged from 82.2% to 100% [7].
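The accuracies quoted above are percentages, and the abstract also refers to the correlation between true and imputed genotypes. The text shown here does not say exactly how either measure was computed, so the following is only a minimal sketch under stated assumptions: genotypes are coded as allele counts (0, 1, 2), accuracy is taken as the percentage of markers at which the imputed genotype matches the true genotype, and the correlation is a plain Pearson correlation. The function names and the example data are hypothetical and are not taken from the study.

```python
from statistics import mean


def concordance(true_gt, imputed_gt):
    """Percentage of markers where the imputed genotype equals the true one.

    Assumption: genotypes are allele counts (0, 1 or 2) in the same marker order.
    """
    if len(true_gt) != len(imputed_gt) or not true_gt:
        raise ValueError("genotype lists must be non-empty and of equal length")
    matches = sum(t == i for t, i in zip(true_gt, imputed_gt))
    return 100.0 * matches / len(true_gt)


def genotype_correlation(true_gt, imputed_gt):
    """Pearson correlation between true and imputed allele counts."""
    if len(true_gt) != len(imputed_gt) or not true_gt:
        raise ValueError("genotype lists must be non-empty and of equal length")
    mean_t, mean_i = mean(true_gt), mean(imputed_gt)
    cov = sum((t - mean_t) * (i - mean_i) for t, i in zip(true_gt, imputed_gt))
    var_t = sum((t - mean_t) ** 2 for t in true_gt)
    var_i = sum((i - mean_i) ** 2 for i in imputed_gt)
    if var_t == 0 or var_i == 0:
        return float("nan")  # undefined when one vector has no variation
    return cov / (var_t * var_i) ** 0.5


# Hypothetical validation animal: 10 masked markers, one imputed incorrectly.
true_genotypes = [0, 1, 2, 1, 0, 2, 1, 0, 2, 1]
imputed_genotypes = [0, 1, 2, 1, 0, 2, 1, 1, 2, 1]

print(f"concordance: {concordance(true_genotypes, imputed_genotypes):.1f}%")
print(f"correlation: {genotype_correlation(true_genotypes, imputed_genotypes):.3f}")
```

Computing both measures per animal, rather than only a population mean, is what allows a per-horse range such as the one reported in the abstract to be stated.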

