Abstract

Successful management and utilization of increasingly large genomic datasets is essential for breeding programs to accelerate cultivar development. To help with this, we developed a Sorghum bicolor Practical Haplotype Graph (PHG) pangenome database that stores haplotypes and variant information. We developed two PHGs in sorghum and used them to identify genome-wide variants for 24 founders of the Chibas sorghum breeding program from 0.01x sequence coverage. The PHG called single nucleotide polymorphisms (SNPs) with 5.9% error at 0.01x coverage, only 3% higher than the PHG error when calling SNPs from 8x coverage sequence. Additionally, 207 progenies from the Chibas genomic selection (GS) training population were sequenced and processed through the PHG. Missing genotypes were imputed from PHG parental haplotypes and used for genomic prediction. Mean prediction accuracies with PHG SNP calls range from 0.57 to 0.73 and are similar to prediction accuracies obtained with genotyping-by-sequencing or targeted amplicon sequencing (rhAmpSeq) markers. This study demonstrates the use of a sorghum PHG to impute SNPs from low-coverage sequence data and shows that the PHG can unify genotype calls across multiple sequencing platforms. By reducing input sequence requirements, the PHG can decrease the cost of genotyping, make GS more feasible, and facilitate larger breeding populations. Our results demonstrate that the PHG is a useful research and breeding tool that maintains variant information from a diverse group of taxa, stores sequence data in a condensed but readily accessible format, unifies genotypes across genotyping platforms, and provides a cost-effective option for genomic selection.
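To make the reported SNP error figures concrete, the following is a minimal sketch of how a genotype error rate could be computed, assuming error is defined as the fraction of discordant calls among sites genotyped in both a test set and a trusted reference set. This section does not state the exact metric used in the study, so that definition, the function name `genotype_error_rate`, and the toy allele-dosage data are all illustrative assumptions.

```python
def genotype_error_rate(called, truth, missing=None):
    """Return the fraction of discordant calls among sites present in both lists.

    Genotypes are coded as allele dosages (0, 1, 2); `missing` marks a site
    with no call. This coding is an assumption made for illustration.
    """
    if len(called) != len(truth):
        raise ValueError("called and truth must cover the same sites")
    compared = 0
    mismatches = 0
    for c, t in zip(called, truth):
        if c is missing or t is missing:
            continue  # skip sites that cannot be compared
        compared += 1
        if c != t:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return mismatches / compared


# Toy data (hypothetical): ten sites, one uncalled, one discordant.
truth = [0, 2, 2, 0, 1, 2, 0, 0, 2, 2]
called = [0, 2, 2, 0, None, 2, 0, 2, 2, 2]
error = genotype_error_rate(called, truth)
print(f"error rate: {error:.3f}")
```

Skipping uncalled sites keeps missing data from being counted as either correct or incorrect; a different treatment of missing calls would change the resulting rate.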

Highlights

  • The goal of plant breeding is to develop improved cultivars with high yield potential and better-quality traits while reducing input requirements and environmental impact

  • One PHG contains only the original founder haplotypes of the Chibas breeding population (“founder Practical Haplotype Graph (PHG),” 24 genotypes), while the other PHG contains both the Chibas founders and whole-genome sequences (WGS) from an additional 374 taxa that reflect the overall diversity within sorghum (“diversity PHG,” 398 genotypes)

  • A genomic variant call format (gVCF) file is made by calling variants between WGS and the reference genome, and variants from the gVCF are added to the PHG database in all genic reference ranges


Introduction

The goal of plant breeding is to develop improved cultivars with high yield potential and better-quality traits while reducing input requirements and environmental impact. As the field of genetics developed, plant breeders began to use marker-assisted selection to associate genetic markers with desirable traits and inform breeding decisions (reviewed in Ramstein, Jensen, & Buckler, 2018). First proposed by Meuwissen, Hayes, and Goddard (2001), genomic selection (GS) is an extension of marker-assisted selection that uses genome-wide markers to make predictions about individual performance. Many studies have shown that GS can accelerate the breeding process and rate of genetic gain without significantly increasing program costs (e.g., Bernardo & Yu, 2007; Heffner, Lorenz, Jannink, & Sorrells, 2010; Heslot, Jannink, & Sorrells, 2015; Meuwissen et al., 2001; Muleta, Pressoir, & Morris, 2019; Poland et al., 2012). Dense marker or haplotype maps in major crops like maize (Zea mays; Bukowski et al., 2018) and sorghum (Sorghum bicolor (L.) Moench; Lozano et al., 2019) can be leveraged to inform breeding decisions.
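As a rough illustration of how genome-wide markers translate into predictions, the sketch below scores each individual as a weighted sum of its marker dosages and then measures prediction accuracy as the Pearson correlation between predicted and observed values. The marker effects, genotypes, and phenotypes are invented toy values, and the helper names are hypothetical. In practice the effects would be estimated from a training population with a statistical model, and this section does not specify which model the study used.

```python
import math


def predict_breeding_values(genotypes, marker_effects):
    """Score each individual as the sum of allele dosage times marker effect."""
    return [
        sum(dosage * effect for dosage, effect in zip(row, marker_effects))
        for row in genotypes
    ]


def pearson(x, y):
    """Pearson correlation between two equal-length numeric lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical marker effects, assumed to be already estimated elsewhere.
marker_effects = [0.5, -0.2, 0.1]

# Toy validation set: rows are individuals, columns are marker dosages (0/1/2).
genotypes = [
    [0, 2, 1],
    [2, 0, 0],
    [1, 1, 2],
    [2, 2, 0],
]
observed = [-0.1, 0.9, 0.4, 0.8]  # invented phenotypes for the same individuals

predicted = predict_breeding_values(genotypes, marker_effects)
accuracy = pearson(predicted, observed)
print(f"prediction accuracy (Pearson r): {accuracy:.2f}")
```

Correlation between predicted and observed values in individuals held out of training is a common way to report GS prediction accuracy; whether the accuracies quoted in the abstract were computed exactly this way is not stated here.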
