Abstract

Including all available data when developing equations to relate midinfrared spectra to a phenotype may be suboptimal for poorly represented spectra. Here, an alternative local changepoint approach was developed to predict six milk technological traits from midinfrared spectra. Neighbours were objectively identified for each predictand as those most similar to the predictand using the Mahalanobis distances between the spectral principal components, and were subsequently used in partial least squares regression (PLSR) analyses. The performance of the local changepoint approach was compared to that of PLSR using all spectra (global PLSR) and another LOCAL approach, whereby a fixed number of neighbours was used in the prediction according to the correlation between the predictand and the available spectra. Global PLSR had the lowest root mean square error of validation (RMSEV) for five traits. The local changepoint approach had the lowest RMSEV for one trait; however, it outperformed the LOCAL approach for four traits. When the 5% of the spectra with the greatest Mahalanobis distance from the centre of the global principal component space were analysed, the local changepoint approach outperformed the global PLSR and the LOCAL approach for two and five traits, respectively. The objective selection of neighbours improved the prediction performance compared to utilising a fixed number of neighbours; however, it generally did not outperform the global PLSR.

Highlights

  • Fourier transform midinfrared spectroscopy (MIRS) is a non-disruptive technique, routinely used in the analysis of both bulk tank and individual animal milk samples to quantify the fat, protein, lactose, and casein concentration [1]

  • We developed a novel local approach, called the local changepoint approach, and applied it to a data set of milk spectral data to predict a suite of milk technological traits

  • Local approaches, whereby the points used to predict a target spectrum are specially selected based on their similarity to the predictand, have the potential to improve the prediction performance over global partial least squares regression (PLSR) prediction in heterogeneous datasets and when non-linear associations are present between the spectra and the trait of interest [4,11]


Summary

Introduction

Fourier transform midinfrared spectroscopy (MIRS) is a non-disruptive technique, routinely used in the analysis of both bulk tank and individual animal milk samples to quantify the fat, protein, lactose, and casein concentration [1]. Partial least squares regression (PLSR) analysis [2,3] is the principal statistical method used to relate MIRS spectral data to a trait, as PLSR can handle collinear and high-dimensional datasets. The use of PLSR to relate milk spectral data to various milk and animal traits has been challenged by alternative machine-learning methods [5,6,7]. When non-linearity between the spectra and the phenotype is present, methods such as artificial neural networks (ANN) [8,9], support vector machines [10], and local approaches [11] have been applied. As reviewed by Perez-Marin et al. [12], local approaches aim to identify the spectra most similar to the target spectrum (i.e., the predictand) and use only these similar spectra (termed neighbours) for phenotype prediction. Identifying and using the spectra most similar to a predictand can potentially improve the accuracy of the PLSR prediction [13,14,15].
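The general idea of a local approach can be illustrated with a minimal Python sketch. It is not the authors' implementation: it simply ranks calibration spectra by their Mahalanobis distance to the predictand in principal component space and fits a single-response PLSR on the nearest ones. The function names, the fixed neighbour count `k`, and the numbers of principal and PLS components are all illustrative assumptions; in particular, the changepoint rule used to choose the neighbours objectively is not described in this section, so a fixed `k` stands in for it here.

```python
import numpy as np

def pca_scores(X, n_components):
    """Project mean-centred spectra onto the leading principal components."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # SVD of the centred matrix; rows of Vt are the principal axes
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    axes = Vt[:n_components].T
    return Xc @ axes, mean, axes

def mahalanobis_to_target(scores, target_score):
    """Mahalanobis distance from each row of `scores` to `target_score`."""
    cov = np.atleast_2d(np.cov(scores, rowvar=False))
    cov_inv = np.linalg.pinv(cov)
    diff = scores - target_score
    return np.sqrt(np.einsum("ij,jk,ik->i", diff, cov_inv, diff))

def pls1_fit(X, y, n_components):
    """Single-response PLS via NIPALS; returns (x_mean, y_mean, coef)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk
        norm = np.linalg.norm(w)
        if norm < 1e-12:          # no covariance left to model
            break
        w = w / norm
        t = Xk @ w                # scores for this component
        tt = t @ t
        p = Xk.T @ t / tt         # X loadings
        c = yk @ t / tt           # y loading
        Xk = Xk - np.outer(t, p)  # deflate X and y
        yk = yk - c * t
        W.append(w)
        P.append(p)
        q.append(c)
    if not W:
        return x_mean, y_mean, np.zeros(X.shape[1])
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)
    return x_mean, y_mean, coef

def local_predict(X_cal, y_cal, x_target, k=50, n_pcs=5, n_pls=3):
    """Predict one target from its k nearest calibration spectra.

    Assumption: k is fixed here; it is a placeholder for an objective
    neighbour-selection rule, which this sketch does not implement.
    """
    scores, mean, axes = pca_scores(X_cal, n_pcs)
    target_score = (x_target - mean) @ axes
    dist = mahalanobis_to_target(scores, target_score)
    neighbours = np.argsort(dist)[:k]
    x_mean, y_mean, coef = pls1_fit(X_cal[neighbours], y_cal[neighbours], n_pls)
    return float(y_mean + (x_target - x_mean) @ coef)
```

Calling `local_predict(X_cal, y_cal, x_target)` with a calibration matrix of spectra (one row per sample), a vector of reference values, and a single target spectrum returns one predicted value. A global PLSR corresponds to calling `pls1_fit` on all calibration spectra instead of only the neighbours.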
