Abstract

Background: In this paper, we extend multi-locus iterative peeling to provide a computationally efficient method for calling, phasing, and imputing sequence data of any coverage in small or large pedigrees. Our method, called hybrid peeling, uses multi-locus iterative peeling to estimate shared chromosome segments between parents and their offspring at a subset of loci, and then uses single-locus iterative peeling to aggregate genomic information across multiple generations at the remaining loci.

Results: Using a synthetic dataset, we first analysed the performance of hybrid peeling for calling and phasing genotypes in disconnected families, each of which contained only a focal individual and its parents and grandparents. Second, we analysed the performance of hybrid peeling for calling and phasing genotypes in the context of a full general pedigree. Third, we analysed the performance of hybrid peeling for imputing whole-genome sequence data to non-sequenced individuals in the population. We found that hybrid peeling substantially increased the number of called and phased genotypes by leveraging sequence information on related individuals. The calling rate and accuracy increased when the full pedigree was used compared to a reduced pedigree of just parents and grandparents. Finally, hybrid peeling accurately imputed whole-genome sequence data to non-sequenced individuals.

Conclusions: We believe that this algorithm will enable the generation of low-cost, high-accuracy whole-genome sequence data in many pedigreed populations. We make this algorithm available as a standalone program called AlphaPeel.
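To illustrate why sequence information on relatives helps when calling genotypes from low-coverage reads, the sketch below combines read counts for one individual at a single biallelic locus with genotype probabilities for its parents. It is a simplified, one-step, single-locus calculation and not the AlphaPeel implementation: the function names, the 1% sequencing error rate, and the 0.98 calling threshold are all assumptions made for this example.

```python
from math import comb

def read_likelihoods(n_ref, n_alt, error=0.01):
    """P(reads | genotype) for 0, 1, or 2 copies of the alternative allele.
    The error rate is an assumed value, not taken from the paper."""
    n = n_ref + n_alt
    p_alt = [error, 0.5, 1.0 - error]  # chance that one read shows the alt allele
    return [comb(n, n_alt) * p**n_alt * (1 - p)**n_ref for p in p_alt]

def transmit(geno_probs):
    """Probability that a parent passes on the alternative allele."""
    return 0.5 * geno_probs[1] + geno_probs[2]

def offspring_prior(sire_probs, dam_probs):
    """Mendelian prior on the offspring genotype given parental probabilities."""
    s, d = transmit(sire_probs), transmit(dam_probs)
    return [(1 - s) * (1 - d), s * (1 - d) + (1 - s) * d, s * d]

def call_genotype(n_ref, n_alt, sire_probs, dam_probs, threshold=0.98):
    """Return (call, posterior); call is None when no genotype reaches the
    threshold. The threshold is a hypothetical choice for this sketch."""
    lik = read_likelihoods(n_ref, n_alt)
    prior = offspring_prior(sire_probs, dam_probs)
    joint = [l * p for l, p in zip(lik, prior)]
    total = sum(joint)
    post = [j / total for j in joint]
    best = max(range(3), key=post.__getitem__)
    return (best if post[best] >= threshold else None), post

# One alternative read, uninformative parents: too uncertain to call.
flat = [0.25, 0.5, 0.25]
print(call_genotype(0, 1, flat, flat))

# Same single read, but one parent is homozygous alternative and the
# other homozygous reference: the offspring must be heterozygous.
print(call_genotype(0, 1, [0.0, 0.0, 1.0], [1.0, 0.0, 0.0]))
```

With uninformative parents, a single read leaves the heterozygous and homozygous alternative genotypes almost equally likely, so no call is made. When the parental genotypes are known, the same read is enough to support a call, which is the effect that peeling exploits across a whole pedigree.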

Highlights

  • In this paper, we extend multi-locus iterative peeling to provide a computationally efficient method for calling, phasing, and imputing sequence data of any coverage in small or large pedigrees

  • Code availability: To perform hybrid peeling, we used the software package AlphaPeel, which is available from the AlphaGenes website

  • Calling and phasing in disconnected families: We found that hybrid peeling yielded a high percentage and accuracy of called genotypes and phased alleles even with low coverage sequencing


Summary

Introduction

We extend multi-locus iterative peeling to provide a computationally efficient method for calling, phasing, and imputing sequence data of any coverage in small or large pedigrees. An emerging strategy in breeding populations is to obtain a mix of high- and low-coverage sequence data on a subset of individuals, and to propagate that information between related individuals to call whole-genome sequence genotypes for all members of a population, some of whom may have only single nucleotide polymorphism (SNP) array genotype data [9]. This strategy exploits the high degree of relatedness and haplotype sharing between individuals in a breeding

