Abstract Measurement of individual dry matter intake (DMI) is critical to improving the management of the beef herd and is necessary to determine the feed efficiency of individual cattle. Combining machine learning models with proxies for DMI (e.g., water intake, body weight, climatic variables) has the potential to replace dedicated feed intake equipment. It would also make it possible to estimate DMI in situations such as grazing, where DMI cannot be quantified directly. In this study, we examined the relationship between DMI, animal variables and environmental variables, and we developed and tested machine learning models based on boosting techniques to predict DMI in cattle, using the same dataset on which we previously published Random Forest results. The dataset included both bulls and steers. The developed models include both regular boosting regression models and a random effects boosting model. The regular boosting models were gradient boosting regression, light gradient boosting regression, and extreme gradient boosting regression. The random effects boosting model, Gaussian process boosting (GPBoost), was developed to account for the random effects associated with the intake behavior of the animals. The extreme gradient boosting model outperformed the other regular boosting regression models, with r² of 0.81 and 0.48 on the training and testing datasets, respectively. However, it does not consider the random effects associated with the longitudinal nature of the animal intake and weather variables, and it is prone to overfitting. GPBoost model variants were developed and tested to avoid overfitting and to incorporate random effects, and we propose GPBoost as the overall best-performing model. GPBoost generalized well, with r² scores of 0.55 and 0.54 on the training and testing datasets, respectively. 
To further understand the model’s performance on the unseen dataset, we applied SHAP analysis to the GPBoost model. The SHAP analysis revealed that water intake, full body weight and age of the animals contributed significantly to predictions on unseen data.
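
The overfitting comparison above can be made concrete with a small sketch. It defines the coefficient of determination and computes the train-minus-test gap from the scores reported in this abstract. The function names `r2_score` and `generalization_gap` are illustrative choices, not taken from the study's code, and the gap is just one simple indicator of overfitting, not the authors' evaluation procedure.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def generalization_gap(r2_train, r2_test):
    """Train-minus-test score; a large positive gap suggests overfitting."""
    return r2_train - r2_test


# (train, test) scores as reported in the abstract.
reported = {"XGBoost": (0.81, 0.48), "GPBoost": (0.55, 0.54)}

# Rounded to two decimals to match the precision of the reported scores.
gaps = {
    name: round(generalization_gap(train, test), 2)
    for name, (train, test) in reported.items()
}

for name, gap in gaps.items():
    print(f"{name}: train-test gap = {gap:.2f}")
```

On the reported scores, extreme gradient boosting loses 0.33 between training and testing while GPBoost loses 0.01, which is the basis for describing GPBoost as the model that generalizes better despite its lower training score.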