Abstract
Breed identification using multiple information sources and methods is widely applied in animal genetics and breeding. With the development of artificial intelligence, the integration of high-throughput genomic data with machine learning techniques is increasingly used for this purpose. In this context, we used 654 individuals from 15 pig breeds to evaluate the performance of machine learning and stacking ensemble learning classifiers, as well as the role of feature selection and anomaly detection in different scenarios. Our results showed that, with a training set of 16 individuals per breed and 32 features (SNPs), the accuracy of breed identification with feature selection (eXtreme Gradient Boosting, XGBoost) could exceed 95.00% (nine breeds), an improvement of 7.04% over the results with random selection. For stacking ensemble learning, feature selection methods (including random selection) were applied before the different base learners. When the training set of these base learners had 16 individuals per breed and 32 features, the accuracy of stacking ensemble learning improved by 9.24% over the best base learner (nine breeds), but stacking did not significantly outperform the models with XGBoost feature selection. With a training set of 16 individuals per breed and 512 features, breed identification with anomaly detection (local outlier factor, LOF) and random selection could achieve an accuracy of 89.06% (15 breeds). These results show that machine learning can be an effective tool for breed identification, and this study also provides useful information for the application of machine learning in animal genetics and breeding.
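To illustrate the anomaly detection step mentioned above, the following is a minimal sketch of the local outlier factor (LOF) written with the Python standard library only. It is not the authors' implementation. The genotype coding (0/1/2 allele counts), the simulated allele frequency, the neighbourhood size k = 5, and all function and variable names are assumptions made for this example; only the panel size of 16 individuals and 32 SNPs mirrors one scenario in the abstract. The sketch takes exactly the k nearest neighbours of each point (ties broken by index) and assumes the panel does not contain more than k identical genotype vectors.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two genotype vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def local_outlier_factor(points, k):
    """Return the LOF score of every point; scores near 1 look like inliers."""
    n = len(points)
    dist = [[euclidean(p, q) for q in points] for p in points]

    # k nearest neighbours of each point, excluding the point itself.
    neighbours = [
        sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])[:k]
        for i in range(n)
    ]
    # Distance from each point to its k-th nearest neighbour.
    k_distance = [dist[i][neighbours[i][-1]] for i in range(n)]

    def local_reachability_density(i):
        # reach-dist(i, j) = max(k-distance(j), d(i, j))
        reach = [max(k_distance[j], dist[i][j]) for j in neighbours[i]]
        return len(reach) / sum(reach)

    density = [local_reachability_density(i) for i in range(n)]

    # LOF(i) = mean density of the neighbours divided by the density of i.
    return [
        sum(density[j] for j in neighbours[i]) / (k * density[i])
        for i in range(n)
    ]


# Hypothetical reference panel: 16 individuals of one breed, 32 SNPs,
# mostly homozygous for the reference allele (coded 0).
rng = random.Random(0)
N_SNPS = 32
reference = [
    [1 if rng.random() < 0.2 else 0 for _ in range(N_SNPS)] for _ in range(16)
]

# Hypothetical query individual from an unrelated breed (all SNPs coded 2).
query = [2] * N_SNPS

scores = local_outlier_factor(reference + [query], k=5)
print("LOF of query individual:", round(scores[-1], 2))
print("largest LOF in reference panel:", round(max(scores[:-1]), 2))
```

A score well above 1 for the query individual suggests that it lies in a much sparser region than the reference panel, so a classifier could reject it as not belonging to any known breed instead of forcing a label. The threshold at which to reject is a tuning choice that this sketch does not address.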