Abstract

Near-infrared spectroscopy can accurately discriminate the sex of silkworm pupae within a single variety. However, when such a model is applied to new varieties of silkworm pupae, its performance degrades because both the cultivation environment and the variety change. To improve the generalization ability and accuracy of the model, this paper proposes a model updating strategy based on semi-supervised learning. First, a support vector machine (SVM) identification model was built after the original spectra were pretreated with Savitzky-Golay convolution smoothing, which effectively reduces spectral noise. Next, the SVM model assigned preliminary labels to the unlabeled silkworm pupae in the update set, dividing them into male and female samples. Based on similarity measured by either the Pearson correlation coefficient or the Euclidean distance, 8 reliable samples were selected from the male and female samples, respectively. These reliable samples were added to the original training set to update the original model. Finally, the updated model was evaluated on test sets drawn from the same varieties as the update sets. The non-updated model reached accuracies of only 54.55%, 68.52%, and 86.84% on the three new varieties, which were not involved in the modeling. The SVM model updated using the Pearson correlation coefficient improved these accuracies to 100%, 96.30%, and 97.37%, and the model updated using the Euclidean distance raised them to 100%, 75.93%, and 92.10%, respectively. Updating with the Pearson correlation coefficient therefore outperformed updating with the Euclidean distance. These results indicate that the method based on semi-supervised learning can effectively address the poor universality of sex identification models.
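The sample-selection step described above can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation. The abstract does not say what each pre-labeled spectrum is compared against, so the sketch assumes the mean training spectrum of the class it was pre-labeled as. It also assumes that 8 samples are kept per class, that spectra are already Savitzky-Golay smoothed, and that the pre-labels come from an SVM fitted elsewhere. The function names `select_reliable` and `update_training_set` are hypothetical.

```python
import numpy as np


def select_reliable(spectra, pre_labels, ref_spectra, ref_labels,
                    n_per_class=8, metric="pearson"):
    """Return sorted row indices of the most reliable pre-labeled spectra.

    Assumption: reliability is similarity to the mean reference spectrum
    of the pre-labeled class (the abstract does not specify the reference).
    Spectra are assumed non-constant so the Pearson denominator is nonzero.
    """
    chosen = []
    for cls in np.unique(pre_labels):
        idx = np.flatnonzero(pre_labels == cls)
        class_mean = ref_spectra[ref_labels == cls].mean(axis=0)
        candidates = spectra[idx]
        if metric == "pearson":
            # Pearson r between each candidate row and the class mean.
            cc = candidates - candidates.mean(axis=1, keepdims=True)
            mc = class_mean - class_mean.mean()
            score = (cc @ mc) / (np.linalg.norm(cc, axis=1) * np.linalg.norm(mc))
        elif metric == "euclidean":
            # Smaller distance means more reliable, so negate it.
            score = -np.linalg.norm(candidates - class_mean, axis=1)
        else:
            raise ValueError(f"unknown metric: {metric}")
        best = idx[np.argsort(-score)[:n_per_class]]
        chosen.extend(best.tolist())
    return np.array(sorted(chosen), dtype=int)


def update_training_set(train_X, train_y, update_X, pre_labels,
                        n_per_class=8, metric="pearson"):
    """Append the selected pseudo-labeled spectra to the training set."""
    keep = select_reliable(update_X, pre_labels, train_X, train_y,
                           n_per_class, metric)
    new_X = np.vstack([train_X, update_X[keep]])
    new_y = np.concatenate([train_y, pre_labels[keep]])
    return new_X, new_y
```

Refitting the classifier on the arrays returned by `update_training_set` would then give the updated model. The two metrics can disagree: the Pearson correlation ignores a constant offset or overall scaling of a spectrum, whereas the Euclidean distance penalizes it, which is one plausible reason the two update variants could select different samples.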

