Genotype imputation for soybean nested association mapping population to improve precision of QTL detection

Linfeng Chen, James E Specht, Brian W Diers, Rouf Mian, Charles Quigley, Earl Taliercio, Susan Araya, Qijian Song, Shouping Yang

doi:10.1007/s00122-022-04070-7

Abstract

Key message: Software for high imputation accuracy in soybean was identified. The imputed dataset could significantly reduce the intervals of genomic regions controlling traits, thus greatly improving the efficiency of candidate gene identification.

Genotype imputation is a strategy to increase the marker density of existing datasets without additional genotyping. We compared the imputation performance of BEAGLE 5.0, IMPUTE 5 and AlphaPlantImpute and tested software parameters that may help to improve imputation accuracy in soybean populations. Several factors, including marker density, extent of linkage disequilibrium (LD) and minor allele frequency (MAF), were examined for their effects on imputation accuracy across the different software. Our results showed that AlphaPlantImpute had a higher imputation accuracy than BEAGLE 5.0 or IMPUTE 5 in each soybean family tested, especially when the study progeny were genotyped with an extremely low number of markers. LD extent, MAF and reference panel size were positively correlated with imputation accuracy; a minimum of 50 markers per chromosome and SNPs with MAF > 0.2 in soybean lines were required to avoid a significant loss of imputation accuracy. Using the software, we imputed 5176 soybean lines in the soybean nested association mapping (NAM) population with the high-density markers of the 40 parents. The dataset, containing 423,419 markers for 5176 lines and 40 parents, was deposited at SoyBase. The imputed NAM dataset was further examined for its improvement of the mapping of quantitative trait loci (QTL) controlling soybean seed protein content. Most of the QTL identified were at identical or similar positions in the initial and imputed datasets; however, QTL intervals were greatly narrowed. The resulting genotypic dataset of the NAM population will facilitate QTL mapping of traits and downstream applications. The information will also help to improve genotype imputation accuracy in self-pollinated crops.
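The abstract refers to imputation accuracy and to a MAF threshold without showing how either quantity is computed. The following is a minimal sketch, assuming genotypes are coded as alternate-allele dosages (0, 1, 2, with `None` for missing) and that accuracy is measured as the fraction of masked calls recovered correctly. The coding, the function names and the accuracy definition are illustrative assumptions, not the authors' pipeline.

```python
from typing import List, Optional

# Assumed coding: alternate-allele dosage 0, 1 or 2; None marks a missing call.
Genotype = Optional[int]


def minor_allele_frequency(calls: List[Genotype]) -> float:
    """Return the MAF of one SNP, ignoring missing calls."""
    observed = [g for g in calls if g is not None]
    if not observed:
        return 0.0
    alt_freq = sum(observed) / (2 * len(observed))
    return min(alt_freq, 1.0 - alt_freq)


def concordance(truth: List[Genotype], imputed: List[Genotype]) -> float:
    """Fraction of calls where the imputed genotype equals the true one.

    Positions missing in either list are skipped.
    """
    pairs = [(t, i) for t, i in zip(truth, imputed)
             if t is not None and i is not None]
    if not pairs:
        return float("nan")
    return sum(t == i for t, i in pairs) / len(pairs)


if __name__ == "__main__":
    # Hypothetical genotypes for one SNP across ten inbred lines.
    snp = [0, 0, 0, 2, 0, 0, 0, 0, 2, 0]
    print("MAF:", minor_allele_frequency(snp))

    # Hypothetical masked-and-imputed calls for eight lines.
    truth = [0, 2, 2, 0, 2, 0, 0, 2]
    imputed = [0, 2, 0, 0, 2, 0, None, 2]
    print("Concordance:", concordance(truth, imputed))
```

In a masking experiment of this kind, a subset of known genotypes is hidden, imputed, and then compared with the hidden truth, so the concordance can be reported per family or per marker density.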

Highlights

In modern breeding programs, germplasm is frequently required to be genotyped with mega- or giga-sized sets of single nucleotide polymorphism (SNP) markers.
The objectives of this study were: to evaluate the imputation performance of three commonly used imputation programs, BEAGLE, IMPUTE and AlphaPlantImpute, in soybean populations, considering factors including the number of markers in the study panel, extent of linkage disequilibrium (LD), minor allele frequency (MAF) of markers and genetic map distance vs. physical distance; to generate an imputed genotype dataset for the soybean nested association mapping (NAM) recombinant inbred lines (RILs) with optimized software parameters for public use; and to demonstrate the improvement in quantitative trait loci (QTL) mapping with the imputed RIL dataset vs. the original dataset in linkage mapping analysis.
For BEAGLE 5.0 imputation of study panels with 5 and 160 markers per chromosome, accuracy increased by 11.30% and 1.04%, respectively, when calls were filtered at genotype probability (GP) > 0.9 compared with no GP filtering.
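The GP filter mentioned in the last highlight can be sketched as follows. This example assumes each imputed call comes with a triple of posterior genotype probabilities (for dosages 0, 1 and 2) and that a call is kept only when its largest probability exceeds the threshold. The data layout and function names are hypothetical, and the numbers are made up for illustration.

```python
from typing import List, Optional, Sequence, Tuple


def call_with_gp_filter(gp: Sequence[float], threshold: float = 0.9) -> Optional[int]:
    """Return the most probable genotype, or None if its GP is not above threshold."""
    best = max(range(len(gp)), key=lambda g: gp[g])
    return best if gp[best] > threshold else None


def filtered_accuracy(truth: List[int],
                      gps: List[Sequence[float]],
                      threshold: float = 0.9) -> Tuple[float, float]:
    """Return (accuracy among retained calls, fraction of calls retained)."""
    calls = [call_with_gp_filter(gp, threshold) for gp in gps]
    kept = [(t, c) for t, c in zip(truth, calls) if c is not None]
    if not kept:
        return float("nan"), 0.0
    accuracy = sum(t == c for t, c in kept) / len(kept)
    return accuracy, len(kept) / len(calls)


if __name__ == "__main__":
    # Hypothetical true genotypes and posterior probabilities for four calls.
    truth = [0, 2, 2, 0]
    gps = [(0.98, 0.01, 0.01),
           (0.05, 0.00, 0.95),
           (0.55, 0.00, 0.45),   # uncertain call, wrong if taken at face value
           (0.92, 0.03, 0.05)]
    print("GP > 0.9 :", filtered_accuracy(truth, gps, threshold=0.9))
    print("no filter:", filtered_accuracy(truth, gps, threshold=0.0))
```

The sketch makes the trade-off explicit: filtering raises accuracy among the retained calls but leaves some genotypes missing, which is why the retained fraction is returned alongside the accuracy.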

Summary

Introduction

Germplasm is frequently required to be genotyped with mega- or giga-sized sets of single nucleotide polymorphism (SNP) markers. BEAGLE 5.0 uses the haplotype frequency model described by Li and Stephens (2003), with a highly parsimonious algorithm that constructs a small subset of reference haplotypes from the full reference panel for imputation; this enables the use of large reference panels at a significantly reduced computational cost (Browning et al. 2018). Although IMPUTE is a more computationally intensive imputation method, the current version, IMPUTE 5, is greatly improved in speed, accuracy and memory efficiency through a new reference panel file format and a haplotype-selection strategy based on the Positional Burrows-Wheeler Transform (PBWT) (Rubinacci et al. 2019). Soybean is an inbred crop with relatively low genetic diversity and long stretches of related haplotypes, especially in populations derived from bi-parental crosses. Although both programs have been widely used in animal and plant genetics, the parameters affecting the size of haplotype clusters in study panels of inbred plants such as soybean need to be optimized. Other tools designed to integrate GBS data from bi-parental populations in plants, including Tassel-FSFHap (Swarts et al. 2014), LB-impute (Fragoso et al. 2016) and NOISYmputer (Lorieux et al. 2019), are also available.
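To illustrate why a PBWT-based strategy makes haplotype selection cheap, the sketch below builds the basic positional prefix ordering on a toy panel of biallelic haplotypes. After each site, haplotypes are reordered so that those sharing the longest match ending at that site sit next to each other, which lets closely related reference haplotypes be found by looking at neighbours. This is a generic illustration of the ordering idea only; it is not IMPUTE 5's implementation, and the function name and toy data are assumptions.

```python
from typing import List


def pbwt_prefix_arrays(haplotypes: List[List[int]]) -> List[List[int]]:
    """Return the haplotype ordering before site 0 and after each site.

    haplotypes[h][k] is the allele (0 or 1) of haplotype h at site k.
    Each step is a stable partition of the previous order by the allele at
    the current site, so haplotypes end up sorted by their reversed prefixes.
    """
    n_sites = len(haplotypes[0]) if haplotypes else 0
    order = list(range(len(haplotypes)))
    orders = [order]
    for k in range(n_sites):
        zeros = [h for h in order if haplotypes[h][k] == 0]
        ones = [h for h in order if haplotypes[h][k] == 1]
        order = zeros + ones  # stable: relative order within each group is kept
        orders.append(order)
    return orders


if __name__ == "__main__":
    # Toy panel: haplotypes 0 and 2 are identical; haplotype 1 differs from
    # them only at the first site; haplotype 3 is unrelated.
    panel = [
        [0, 1, 0, 1],
        [1, 1, 0, 1],
        [0, 1, 0, 1],
        [1, 0, 1, 0],
    ]
    for k, order in enumerate(pbwt_prefix_arrays(panel)):
        print(f"after {k} sites: {order}")
```

In the final ordering of this toy panel, haplotypes 0, 2 and 1 are adjacent because they share the same alleles over the last three sites, while haplotype 3 is separated from them.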

Journal: Theoretical and Applied Genetics
Publication Date: Mar 11, 2022
Citations: 4
License type: open-access
