Abstract

Imputation from lower density SNP chip genotypes to whole-genome sequence level is an established approach to generate high density genotypes for many individuals. Imputation accuracy depends on many factors, and for small cattle populations such as the endangered German Black Pied cattle (DSN), determining the optimal imputation strategy is especially challenging because only a low number of high density genotypes is available. In this paper, the accuracy of imputation was explored with regard to (1) phasing of the target population and the reference panel for imputation, (2) comparison of a 1-step imputation approach, where 50 k genotypes are directly imputed to sequence level, to a 2-step imputation approach that imputes first to 700 k and subsequently to sequence level, (3) the software tools Beagle and Minimac, and (4) the size and composition of the reference panel for imputation. Analyses were performed for 30 DSN and 30 Holstein Frisian cattle available from the 1000 Bull Genomes Project. Imputation accuracy was assessed using a leave-one-out cross-validation procedure. We observed that phasing of the target populations and the reference panels affects the imputation accuracy significantly. Minimac reached higher accuracy when imputing with small reference panels, while Beagle performed better with larger reference panels. In contrast to previous research, we found that when a low number of animals is available at the intermediate imputation step, the 1-step imputation approach yielded higher imputation accuracy than the 2-step approach. Overall, the size of the reference panel for imputation is the most important factor leading to higher imputation accuracy, although using a larger reference panel consisting of a related but different breed (Holstein Frisian) significantly reduced imputation accuracy.
Our findings provide specific recommendations for populations with a limited number of high density genotyped or sequenced animals available, such as DSN. The overall recommendations when imputing a small population are to (1) use a large reference panel of the same breed, (2) use a large reference panel consisting of diverse breeds, or (3) when a large reference panel is not available, use a smaller same-breed reference panel without including a different related breed.
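
The leave-one-out procedure mentioned above can be illustrated with a minimal Python sketch. It rests on several assumptions that are not taken from the paper: accuracy is measured as genotype concordance (the share of masked sites imputed correctly), genotypes are coded 0/1/2, and `majority_imputer` is a hypothetical toy stand-in for a real tool such as Beagle or Minimac. All function and variable names are illustrative.

```python
from collections import Counter


def concordance(true_genotypes, imputed_genotypes):
    """Fraction of sites where the imputed genotype equals the true one."""
    if not true_genotypes:
        raise ValueError("no sites to compare")
    matches = sum(t == i for t, i in zip(true_genotypes, imputed_genotypes))
    return matches / len(true_genotypes)


def majority_imputer(reference, target, masked_sites):
    """Hypothetical toy imputer (NOT Beagle or Minimac): fill each masked
    site with the most common genotype among the reference animals."""
    imputed = list(target)
    for site in masked_sites:
        counts = Counter(animal[site] for animal in reference)
        imputed[site] = counts.most_common(1)[0][0]
    return imputed


def leave_one_out_accuracy(genotypes, chip_sites, impute=majority_imputer):
    """Mask every non-chip site of one animal at a time, impute it from
    all remaining animals, and score the masked sites by concordance.

    genotypes:  dict of animal id -> list of 0/1/2 codes at sequence-level sites
    chip_sites: set of site indices assumed to be on the low-density chip
    """
    n_sites = len(next(iter(genotypes.values())))
    masked = [s for s in range(n_sites) if s not in chip_sites]
    accuracy = {}
    for animal_id, truth in genotypes.items():
        # The left-out animal is the target; everyone else is the reference.
        reference = [g for other, g in genotypes.items() if other != animal_id]
        target = [g if s in chip_sites else None for s, g in enumerate(truth)]
        imputed = impute(reference, target, masked)
        accuracy[animal_id] = concordance(
            [truth[s] for s in masked], [imputed[s] for s in masked]
        )
    return accuracy


# Toy data: four animals, four sites, the first two sites are "on the chip".
toy = {
    "A": [0, 1, 2, 0],
    "B": [0, 1, 2, 0],
    "C": [0, 1, 2, 1],
    "D": [0, 1, 0, 0],
}
per_animal = leave_one_out_accuracy(toy, chip_sites={0, 1})
mean_accuracy = sum(per_animal.values()) / len(per_animal)
```

Passing a different `impute` callable is how the comparison of tools, phasing strategies, or reference panels would plug into this loop; in practice that callable would wrap an external program rather than the toy rule used here.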

Highlights

  • Imputation from lower density SNP chip genotypes to whole-genome sequence level is a practical, cheap and fast method to generate high density genotypes for many individuals

  • When imputing the DSN target population with the “1000 bulls” reference panel from 50 k to sequence level, significant differences in imputation accuracy were observed with regard to different phasing strategies (Figure 2 and Supplementary Table 1)

  • The highest mean imputation accuracy (93.0%) was observed when the target population and the reference panel were both phased with Beagle (Figure 2 and Supplementary Table 2)


Introduction

Imputation from lower density SNP chip genotypes to whole-genome sequence level is a practical, cheap and fast method to generate high density genotypes for many individuals. The 1000 Bull Genomes Project offers a large reference panel that includes many Holstein Frisian (HF) animals at sequence level and can be used for imputation within the same breed (Brøndum et al., 2014; van Binsbergen et al., 2015; Pausch et al., 2017). Imputation to higher density is especially challenging in populations where only a low number of high density genotypes or sequence data is available. One such population is the endangered German Black Pied cattle (DSN, “Deutsches Schwarzbuntes Niederungsrind”), which is considered to be one of the founder breeds of German HF (Porter, 1991). Finding an optimal imputation strategy for such small populations is necessary in order to obtain high quality imputed genotypes for further analyses.
