Abstract

Key message: New fast and accurate method for phasing and imputation of SNP chip genotypes within diploid bi-parental plant populations.

This paper presents a new heuristic method for phasing and imputation of genomic data in diploid plant species. Our method, called AlphaPlantImpute, explicitly leverages features of plant breeding programmes to maximise the accuracy of imputation. These features are a small number of parents, which can be inbred and usually have high-density genomic data, and few recombinations separating the parents from the focal individuals genotyped at low density (i.e. the descendants that are the imputation targets). AlphaPlantImpute works in roughly three steps. First, it identifies informative low-density genotype markers in the parents. Second, it tracks the inheritance of parental alleles and haplotypes to focal individuals at the informative markers. Finally, it uses this low-density information as anchor points to impute focal individuals to high density.

We tested the imputation accuracy of AlphaPlantImpute in simulated bi-parental populations across different scenarios and compared it with that of existing software called PlantImpute. In general, the imputation accuracy of AlphaPlantImpute was equal to or better than that of PlantImpute, while its computational time and memory requirements were tiny by comparison. For example, the accuracy of imputation was 0.96 for a scenario where both parents were inbred and genotyped at 25,000 markers per chromosome and a focal F2 individual was genotyped with 50 markers per chromosome. This scenario had a maximum memory requirement of 0.08 GB and took 37 s to complete.
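The three steps described in the abstract can be illustrated with a toy sketch. This is not the AlphaPlantImpute implementation. It assumes two fully inbred parents with genotypes coded as alternative-allele dosages (0 or 2), a focal individual with no missing low-density genotypes, and a simple nearest-anchor rule for the final step, which is a stand-in chosen for brevity rather than the method's actual treatment of recombination. All function and variable names are hypothetical.

```python
from bisect import bisect_left


def informative_markers(parent1, parent2, ld_positions):
    """Step 1: keep low-density markers at which the inbred parents differ."""
    return [m for m in ld_positions if parent1[m] != parent2[m]]


def parent2_dosage(parent1, parent2, focal_ld, anchors):
    """Step 2: count copies inherited from parent 2 at each informative marker."""
    dosage = {}
    for m in anchors:
        g = focal_ld[m]
        if g == parent1[m]:
            dosage[m] = 0  # both alleles trace back to parent 1
        elif g == parent2[m]:
            dosage[m] = 2  # both alleles trace back to parent 2
        else:
            dosage[m] = 1  # heterozygous: one allele from each parent
    return dosage


def impute(parent1, parent2, dosage):
    """Step 3: fill every high-density marker from its nearest anchor.

    Assumption: nearest-anchor assignment, a deliberate simplification.
    """
    anchors = sorted(dosage)
    imputed = []
    for m in range(len(parent1)):
        i = bisect_left(anchors, m)
        left = anchors[i - 1] if i > 0 else None
        right = anchors[i] if i < len(anchors) else None
        if left is None and right is None:
            imputed.append(None)  # no anchors at all: leave missing
            continue
        if left is None:
            nearest = right
        elif right is None:
            nearest = left
        else:
            nearest = left if m - left <= right - m else right
        d = dosage[nearest]
        # Mix parental genotypes according to the inherited dosage.
        imputed.append(((2 - d) * parent1[m] + d * parent2[m]) // 2)
    return imputed


# Hypothetical toy data: 8 high-density markers, 4 genotyped at low density.
parent1 = [0, 0, 2, 0, 2, 2, 0, 0]
parent2 = [2, 0, 0, 2, 2, 0, 2, 0]
focal_ld = {0: 0, 1: 0, 3: 1, 6: 2}  # marker index -> observed genotype

anchors = informative_markers(parent1, parent2, sorted(focal_ld))
dosage = parent2_dosage(parent1, parent2, focal_ld, anchors)
imputed = impute(parent1, parent2, dosage)
print(anchors)  # [0, 3, 6]
print(imputed)  # [0, 0, 1, 1, 2, 0, 2, 0]
```

Marker 1 is dropped in step 1 because both parents carry the same genotype there, so it says nothing about which parent a segment came from. The sketch ignores phasing, non-inbred parents, genotyping errors, and the placement of recombination breakpoints between anchors.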

Highlights

  • This paper presents a new heuristic method, called AlphaPlantImpute, for phasing and imputation of SNP array data in diploid plant species

  • Increasing the number of low-density (LD) markers beyond 10 mitigates the decrease in average imputation accuracy between F2 and F10 focal individuals

Summary

Introduction

This paper presents a new heuristic method for phasing and imputation of single-nucleotide polymorphism (SNP) array data in diploid plant species. An effective strategy to overcome the cost barrier of genotyping many individuals at high density is to genotype a proportion of the population at high density, phase their genotypes, and use these data to impute the large numbers of individuals genotyped at low density (Jacobson et al 2014, 2015; Gorjanc et al 2017a, b). This strategy has been widely adopted in livestock and human populations, partly because
