Genome-wide selection of discriminant SNP markers for breed assignment in indigenous sheep breeds

Mohammad Hossein Moradi, Mehdi Kazemi-Bonchenari, Mahdi Khodaei-Motlagh, Amir Hossein Khaltabadi-Farahani, John Mcewan

doi:10.2478/aoas-2020-0097

Open Access

https://doi.org/10.2478/aoas-2020-0097

Journal: Annals of Animal Science
Publication Date: Jul 1, 2021
Citations: 5
License type: CC BY 4.0

Affiliation: Arak University

Abstract

Assigning an individual to its true population of origin is one of the most important practical applications of genomic data in animal breeding. The aim of this study was to develop a statistical method and then to identify the minimum number of informative SNP markers from high-throughput genotyping data able to trace the true breed of unknown samples in indigenous sheep breeds. A total of 217 animals from the Zel, Lori-Bakhtiari, Afshari, Moqani and Qezel breeds and a wild-type Iranian sheep breed were genotyped using the Illumina OvineSNP50K BeadChip. After SNP quality control, principal component analysis (PCA) was used to determine how the animals were allocated to groups using all genotyped markers. The first principal component (PC1) separated the domestic breeds from the wild sheep, and the second (PC2) separated the domestic breeds from one another. The genetic distance between breeds was calculated using the FST and Reynolds methods, and the results showed that the breeds were well differentiated. A statistical method combining stepwise discriminant analysis (SDA) and linear discriminant analysis (LDA) was developed to reduce the number of SNPs needed to discriminate the 6 Iranian sheep populations, and K-fold cross-validation was used to evaluate the assignment success rate of the selected subset of SNPs. The procedure reduced the marker pool to 201 SNPs that discriminated all sheep populations with 100% accuracy. Moreover, a discriminant analysis of principal components (DAPC) based on the 201 linearly independent SNPs showed that these markers assigned all individuals to their true breed. Finally, the 201 identified SNPs were successfully applied to an independent out-group of 96 samples of the Baluchi sheep breed, and the results indicated that these markers correctly allocate all unknown samples to their true population of origin. Overall, the results indicate that the combined use of the SDA and LDA techniques is an efficient strategy for selecting a reduced pool of highly discriminant markers.
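The marker-reduction idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' SDA/LDA procedure: as a labeled simplification, a greedy forward search stands in for stepwise discriminant analysis and a nearest-centroid rule stands in for LDA, with each candidate SNP scored by K-fold cross-validated assignment accuracy. The function names, the 0/1/2 genotype coding, and the toy data layout are all assumptions made for illustration.

```python
import random

def centroids(X, y, feats):
    """Mean genotype per breed over the chosen SNP columns."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(feats))
        for k, j in enumerate(feats):
            acc[k] += row[j]
        counts[label] = counts.get(label, 0) + 1
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}

def predict(row, cents, feats):
    """Assign a sample to the breed with the closest centroid (stand-in for LDA)."""
    def dist(c):
        return sum((row[j] - c[k]) ** 2 for k, j in enumerate(feats))
    # Ties go to the alphabetically first breed label.
    return min(sorted(cents), key=lambda lab: dist(cents[lab]))

def cv_accuracy(X, y, feats, k=5, seed=0):
    """K-fold cross-validated assignment accuracy for a given SNP subset."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    correct = 0
    for fold in folds:
        held = set(fold)
        train = [i for i in idx if i not in held]
        cents = centroids([X[i] for i in train], [y[i] for i in train], feats)
        correct += sum(predict(X[i], cents, feats) == y[i] for i in fold)
    return correct / len(X)

def forward_select(X, y, max_snps, k=5):
    """Greedily add the SNP that most improves CV accuracy (stand-in for SDA)."""
    chosen, best = [], 0.0
    remaining = list(range(len(X[0])))
    while remaining and len(chosen) < max_snps and best < 1.0:
        scored = [(cv_accuracy(X, y, chosen + [j], k), j) for j in remaining]
        # Highest accuracy wins; ties go to the lowest SNP index.
        score, j = max(scored, key=lambda t: (t[0], -t[1]))
        if score <= best:
            break  # no remaining SNP improves assignment
        chosen.append(j)
        remaining.remove(j)
        best = score
    return chosen, best
```

Because the same samples are used both to pick SNPs and to score them, the accuracy returned here is optimistic; testing the selected panel on samples that played no part in the selection gives a fairer estimate.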

Highlights

Pedigree information is essential for accurate genetic evaluation (Heaton et al., 2014).
The highest and lowest numbers of remaining single nucleotide polymorphism (SNP) markers were observed in the Qezel and wild-type sheep breeds, respectively.
The genetic distance statistics showed that the breeds were well differentiated and that their genomic information could be used to determine discriminant SNP markers.
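For readers unfamiliar with the genetic distance statistics mentioned above, the following is a minimal sketch of a pairwise FST and the Reynolds distance derived from it. The source does not say which FST estimator was used, so the sketch assumes a simple heterozygosity-based form, (H_T - H_S) / H_T, pooled across SNPs as a ratio of sums, with equal population weights and no sample-size correction. Genotypes are assumed to be coded 0/1/2, and the function names are hypothetical.

```python
import math

def allele_freq(genotypes):
    """Frequency of the counted allele from 0/1/2 genotype codes."""
    return sum(genotypes) / (2 * len(genotypes))

def fst_two_pops(pop_a, pop_b):
    """Ratio-of-sums FST over SNPs for two populations.

    pop_a, pop_b: lists of individuals, each a list of 0/1/2 codes
    with the same SNP order. Uncorrected for sample size (assumption).
    """
    num = den = 0.0
    for j in range(len(pop_a[0])):
        p1 = allele_freq([ind[j] for ind in pop_a])
        p2 = allele_freq([ind[j] for ind in pop_b])
        p_bar = (p1 + p2) / 2
        h_t = 2 * p_bar * (1 - p_bar)                         # total expected heterozygosity
        h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2     # mean within-population value
        num += h_t - h_s
        den += h_t
    return num / den if den > 0 else 0.0

def reynolds_distance(fst):
    """Reynolds coancestry distance, D = -ln(1 - FST)."""
    if fst >= 1.0:
        return math.inf  # populations fixed for different alleles at every SNP
    return -math.log(1.0 - fst)

# Toy data: SNP 0 is fixed for different alleles, SNP 1 is identical in both breeds.
breed_a = [[0, 1], [0, 1]]
breed_b = [[2, 1], [2, 1]]
fst = fst_two_pops(breed_a, breed_b)
```

Larger values of either statistic indicate stronger differentiation between the two breeds, which is the condition under which a small panel of discriminant SNPs becomes feasible.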

Summary

Introduction

Pedigree information is essential for accurate genetic evaluation (Heaton et al., 2014). Principal component analysis (PCA) has more recently been proposed as an alternative method for identifying population-informative SNP markers. This method has already been used in human populations to characterize their structure from SNP genotyping data (Paschou et al., 2007) and in cattle to identify breed-informative SNPs (Bertolini et al., 2018). The lack of complete pedigrees and the misidentification of sires reduce the accuracy of genetic evaluation and the efficiency of breeding programs (Tortereau et al., 2017). In this situation, developing a statistical method for selecting discriminant SNPs for breed assignment in Iranian sheep breeds, for which no report is yet available, could be crucial. The aim of this study was to identify the minimum number of informative SNPs from high-throughput genotyping data for assigning unknown individuals to their true population of origin in Iranian sheep breeds. The results could further be used to develop a low-cost customized assay to trace the breed of animals or of derived foodstuffs.