Abstract

BackgroundImputation accuracy among other things depends on the size of the reference panel, the marker’s minor allele frequency (MAF), and the correct placement of single nucleotide polymorphism (SNP) on the reference genome assembly. Using high-density genotypes of 3938 Nellore cattle from Brazil, we investigated the accuracy of imputation from 50 K to 777 K SNP density using Minimac3, when map positions were determined according to the bovine genome assemblies UMD3.1 and ARS-UCD1.2. We assessed the effect of reference and target panel sizes on the pre-phasing based imputation quality using ten-fold cross-validation. Further, we compared the reliability of the model-based imputation quality score (Rsq) from Minimac3 to the empirical imputation accuracy.ResultsThe overall accuracy of imputation measured as the squared correlation between true and imputed allele dosages (R2dose) was almost identical using either the UMD3.1 or ARS-UCD1.2 genome assembly. When the size of the reference panel increased from 250 to 2000, R2dose increased from 0.845 to 0.917, and the number of polymorphic markers in the imputed data set increased from 586,701 to 618,660. Advantages in both accuracy and marker density were also observed when larger target panels were imputed, likely resulting from more accurate haplotype inference. Imputation accuracy increased from 0.903 to 0.913, and the marker density in the imputed data increased from 593,239 to 595,570 when haplotypes were inferred in 500 and 2900 target animals. The model-based imputation quality scores from Minimac3 (Rsq) were systematically higher than empirically estimated accuracies. However, both metrics were positively correlated and the correlation increased with the size of the reference panel and MAF of imputed variants.ConclusionsAccurate imputation of BovineHD BeadChip markers is possible in Nellore cattle using the new bovine reference genome assembly ARS-UCD1.2. The use of large reference and target panels improves the accuracy of the imputed genotypes and provides genotypes for more markers segregating at low frequency for downstream genomic analyses. The model-based imputation quality score from Minimac3 (Rsq) can be used to detect poorly imputed variants but its reliability depends on the size of the reference panel and MAF of the imputed variants.

Highlights

  • Imputation accuracy among other things depends on the size of the reference panel, the marker’s minor allele frequency (MAF), and the correct placement of single nucleotide polymorphism (SNP) on the reference genome assembly

  • Target panels genotyped with low-density single nucleotide polymorphism (SNP) chips can be imputed to higher density using a reference panel typed at higher density

  • Comparison of the accuracy of imputation between the UMD3.1 and ARS-UCD1.2 genome assemblies We considered 3938 Nellore cattle genotyped at 777,962 SNPs using the Illumina BovineHD BeadChip to study the effect of reference genome assemblies on the imputation accuracy

Read more

Summary

Introduction

Imputation accuracy among other things depends on the size of the reference panel, the marker’s minor allele frequency (MAF), and the correct placement of single nucleotide polymorphism (SNP) on the reference genome assembly. The software that implement a pre-phasing-based strategy [12], for example, Minimac3 [13] and IMPUTE2 [14], impute genotypes into phased haplotypes. These tools are probably more sensitive to the size of the target panel because the phasing accuracy depends on the number and the relatedness of the samples considered [15,16,17]. Inferred haplotypes might compromise the quality of subsequent imputation

Methods
Results
Discussion
Conclusion
Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.