Abstract
Deep learning offers a portfolio of highly flexible models, known as neural networks (NNs), that can solve a wide range of problems and have set new standards for prediction accuracy. Nevertheless, whether NNs are of value in aquaculture selective breeding is unclear, as the topic is underexplored. Furthermore, tuning the many hyperparameters of a neural network before fitting it is a daunting task. Using simulated datasets and a publicly available dataset on genetic resistance of carp to koi herpes virus (KHV), various neural network architectures were benchmarked against commonly used animal breeding models. The simulated datasets comprised 36000 phenotyped animals genotyped for 54000 single nucleotide polymorphisms (SNPs), whereas the carp dataset included 1255 carp juveniles with survival recordings for KHV that were genotyped for 15615 SNPs. The assessed NN architectures included multilayer perceptrons (MLPs), convolutional neural networks (CNNs) and local convolutional neural networks (LCNNs). In addition, the effect of various hyperparameters of neural networks, such as the number of hidden layers, neurons per layer, activation function, learning rate, batch size, and regularisation techniques like dropout, was examined. In the simulated datasets, fully connected models with 5 hidden layers and 100 neurons per layer performed slightly better (1 – 4 %) than ridge-regression best linear unbiased prediction (rrBLUP), while the CNNs gave the lowest prediction accuracies (∼ 14 % lower than MLPs) and the LCNNs fell in between (∼ 8 % lower than MLPs). Nevertheless, the estimated breeding values from NNs appeared more biased than those from rrBLUP (mean regression slope of 1.2 for the NN with the highest prediction accuracy vs 1.08 for rrBLUP).
The reverse pattern was observed for the carp dataset, where, based on the area under the receiver operating characteristic (ROC) curve (AUC), the animal breeding models outperformed the neural networks by more than 2 %. Among the NNs, the LCNN had the highest AUC. Overall, NNs could be valuable tools in aquaculture breeding programs, though large training datasets of tens of thousands or more phenotyped and genotyped animals seem to be required.
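As a rough illustration of the two summary statistics used for the simulated data (prediction accuracy and regression slope as a measure of bias), the following minimal sketch fits a ridge-regression marker model to a small simulated dataset. Everything in it is an assumption made for illustration and not taken from the study: the dataset dimensions, the heritability `h2`, the helper names `fit_ridge_markers` and `accuracy_and_slope`, the use of the simulation's own variance ratio as the ridge penalty, and the definition of the slope as the regression of true breeding values on estimated ones.

```python
import numpy as np

def fit_ridge_markers(X, y, lam):
    """Ridge estimates of marker effects via the n x n dual form.

    Equivalent to solving (X'X + lam*I) b = X'(y - mean), but cheaper
    when there are more markers than animals.
    """
    mu = y.mean()
    n = X.shape[0]
    alpha = np.linalg.solve(X @ X.T + lam * np.eye(n), y - mu)
    return mu, X.T @ alpha

def accuracy_and_slope(tbv, ebv):
    """Pearson correlation and slope of the regression of tbv on ebv."""
    acc = np.corrcoef(tbv, ebv)[0, 1]
    slope = np.cov(tbv, ebv)[0, 1] / np.var(ebv, ddof=1)
    return acc, slope

# Toy simulation (assumed sizes, far smaller than the datasets in the study).
rng = np.random.default_rng(0)
n_train, n_valid, p, h2 = 600, 200, 500, 0.5
freq = rng.uniform(0.1, 0.9, p)
G = rng.binomial(2, freq, size=(n_train + n_valid, p)).astype(float)

# Centre genotypes using the training animals only.
X = G - G[:n_train].mean(axis=0)

beta = rng.normal(0.0, 1.0, p)            # true marker effects, variance 1
tbv = X @ beta                            # true breeding values
var_e = tbv.var() * (1.0 - h2) / h2       # residual variance giving the target h2
y = tbv + rng.normal(0.0, np.sqrt(var_e), n_train + n_valid)

# Ridge penalty = residual variance / marker-effect variance (known here).
lam = var_e / 1.0
mu, beta_hat = fit_ridge_markers(X[:n_train], y[:n_train], lam)

ebv = X[n_train:] @ beta_hat              # estimated breeding values, validation set
acc, slope = accuracy_and_slope(tbv[n_train:], ebv)
print(f"accuracy = {acc:.3f}, slope = {slope:.3f}")
```

Under this definition, a slope close to 1 indicates estimated breeding values on the same scale as the true ones, and larger deviations from 1 indicate more bias. In real data the variance ratio is not known and would have to be estimated, for example by restricted maximum likelihood.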