Abstract

Background

The algorithm for proven and young (APY) has been suggested as a solution for recursively computing a sparse representation for the inverse of a large genomic relationship matrix (G). In APY, a subset of genotyped individuals is used as the core and the remaining genotyped individuals are used as noncore. Size and definition of the core are relevant research subjects for the application of APY, especially given the ever-increasing number of genotyped individuals.

Methods

The aim of this study was to investigate several core definitions: the most popular animals (MPA) (i.e., animals with high contributions to the genetic pool), the least popular males (LPM), the least popular females (LPF), a random set (Rnd), animals evenly distributed across genealogical paths (Ped), unrelated individuals (Unrel), animals chosen by within-family selection (Fam), and animals chosen by decomposition of the gene content matrix (QR). Each definition was evaluated for six core sizes based on prediction accuracy of single-step genomic best linear unbiased prediction (ssGBLUP) with APY. Prediction accuracy of ssGBLUP with the full inverse of G was used as the baseline. The dataset consisted of 357k pedigreed Duroc pigs, of which 111k had genotypes, and ~220k phenotypic records.

Results

When the core size was equal to the number of largest eigenvalues explaining 50% of the variation of G (n = 160), MPA and Ped core definitions delivered the highest average prediction accuracies (~0.41−0.53). As the core size increased to the number of eigenvalues explaining 99% of the variation in G (n = 7320), prediction accuracy was nearly identical for all core types, and correlations with genomic estimated breeding values (GEBV) from ssGBLUP with the full inverse of G were greater than 0.99 for all core definitions. Cores that represent all generations, such as Rnd, Ped, Fam, and Unrel, were grouped together in the hierarchical clustering of GEBV.

Conclusions

For small core sizes, the definition of the core matters; however, as the size of the core reaches an optimal value equal to the number of largest eigenvalues explaining 99% of the variation of G, the definition of the core becomes arbitrary.
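
The abstract does not spell out the computations, so the following is only a minimal sketch of the two ideas it relies on: choosing a core size as the number of largest eigenvalues of G that explain a given fraction of its variation, and building the sparse APY inverse from a core/noncore split. It assumes the commonly described APY formulation, in which each noncore animal is conditioned on the core only, so the noncore block of the inverse is diagonal. The function names `core_size_for_variance` and `apy_inverse` are hypothetical, and the dense NumPy arrays are meant for toy data, not for a matrix with 111k genotyped animals.

```python
import numpy as np


def core_size_for_variance(G, fraction):
    """Number of largest eigenvalues of G explaining `fraction` of its variation.

    Hypothetical helper: interprets "variation explained" as the cumulative
    share of the eigenvalue sum.
    """
    eigvals = np.linalg.eigvalsh(G)[::-1]      # eigenvalues, largest first
    eigvals = np.clip(eigvals, 0.0, None)      # guard against tiny negatives
    cum_share = np.cumsum(eigvals) / eigvals.sum()
    k = int(np.searchsorted(cum_share, fraction)) + 1
    return min(k, G.shape[0])


def apy_inverse(G, core):
    """Dense sketch of the APY inverse of G for a given list of core indices.

    Assumes the standard APY structure:
      G_apy^-1 = [[Gcc^-1, 0], [0, 0]]
                 + [[-Gcc^-1 Gcn], [I]] M^-1 [[-Gnc Gcc^-1, I]]
    where M is diagonal with m_j = g_jj - g_jc Gcc^-1 g_cj for noncore j.
    """
    n = G.shape[0]
    core = np.asarray(core)
    noncore = np.setdiff1d(np.arange(n), core)

    Gcc_inv = np.linalg.inv(G[np.ix_(core, core)])
    Gcn = G[np.ix_(core, noncore)]
    P = Gcc_inv @ Gcn                          # core x noncore

    # Diagonal of M: one conditional variance per noncore animal.
    m = G[noncore, noncore] - np.einsum("ij,ij->j", Gcn, P)

    Ginv = np.zeros((n, n))
    Ginv[np.ix_(core, core)] = Gcc_inv + (P / m) @ P.T
    Ginv[np.ix_(core, noncore)] = -P / m
    Ginv[np.ix_(noncore, core)] = (-P / m).T
    Ginv[noncore, noncore] = 1.0 / m           # only the diagonal is nonzero
    return Ginv
```

Only the core block requires a dense inversion; the noncore block stays diagonal, which is what makes the representation sparse. In this sketch a core of size `core_size_for_variance(G, 0.99)` would correspond to the 99% setting discussed above, with the choice of which animals fill the core left to the definition being tested (MPA, Rnd, Ped, and so on).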
