Abstract The objectives of this study were to develop sub-models for predicting protein requirements and supply, encompassing 1) net protein for maintenance (NPm), 2) net protein for lactation (NPl), 3) rumen undegradable protein (RUP), and 4) duodenal microbial nitrogen (MicN) from the feed protein. The dataset was constructed by integrating in vivo experimental data collected from open databases (the National Animal Nutrition Program) and articles (Journal of Dairy Science), and includes a total of 1,779 observations from 436 publications. In developing the models, animal information and feed chemical components were used as candidate variables, and two machine learning algorithms, Random Forest Regression (RFR) and Support Vector Regression (SVR), were employed. After testing, the following predictors were selected: 1) NPm: body weight (BW) and dry matter intake (DMI), 2) NPl: BW, DMI, days in milk (DIM), and dietary organic matter (OM) and crude protein (CP) contents, 3) RUP: DIM, DMI, dietary DM content, and CP fraction intake (B and C), and 4) MicN: DIM, DMI, DM, dietary neutral detergent fiber (NDF) content, and CP fraction intake (A, B, and C). The selected models were assessed by cross-validation using the following statistical metrics: the coefficient of determination (R2), root-mean-square error of prediction (RMSEP), residual analyses, and the concordance correlation coefficient (CCC). The RUP and MicN models were also compared with the NASEM (2021) model. In predicting NPm, both the SVR and RFR algorithms demonstrated high precision (R2 = 0.965 vs. 0.969) and accuracy (RMSEP = 9.7 vs. 9.2 g/d); however, in the residual analysis, the RFR model showed a statistically significant slope bias (P < 0.05). In the NPl prediction, the RFR algorithm performed slightly better than SVR (R2 = 0.864 vs. 0.814 and RMSEP = 86.7 vs. 98.5 g/d). However, as in the NPm prediction, the RFR model displayed a statistically significant slope bias (P < 0.05). Among the supply models, the RFR model exhibited the greatest precision and accuracy in predicting RUP (R2 = 0.60, RMSEP = 0.326 kg/d, and CCC = 0.71) without any biases. This model achieved a 2.37-fold increase in R2 and a decrease of 0.111 kg/d in RMSEP compared with the NASEM model. For MicN prediction, the SVR model performed best (R2 = 0.76, RMSEP = 42.4 g/d, and CCC = 0.86) without biases. This model attained a 19-fold improvement in R2 and a reduction of 38.7 g/d in RMSEP compared with the NASEM model. In conclusion, models developed using machine learning algorithms can help predict protein requirements and supply accurately and precisely based on animal information and the chemical composition of feed.
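The evaluation metrics named above (R2, RMSEP, CCC, and the residual analysis for slope bias) can be illustrated with a minimal Python sketch. It is not the authors' code, and it rests on several assumptions: R2 is taken as the squared Pearson correlation between observed and predicted values, CCC follows Lin's definition with divide-by-n variances, and slope bias is the slope from regressing residuals (observed minus predicted) on mean-centered predictions. The function name `evaluate` and the sample values are hypothetical. The significance test behind the reported P values is omitted.

```python
import math
from statistics import fmean


def evaluate(observed, predicted):
    """Return R2, RMSEP, CCC, mean bias, and slope bias for paired values.

    Assumed definitions (not confirmed by the abstract): R2 is the squared
    Pearson correlation, and CCC is Lin's concordance correlation coefficient.
    """
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")

    mean_o = fmean(observed)
    mean_p = fmean(predicted)

    # Population (divide-by-n) variances and covariance.
    var_o = sum((o - mean_o) ** 2 for o in observed) / n
    var_p = sum((p - mean_p) ** 2 for p in predicted) / n
    if var_o == 0 or var_p == 0:
        raise ValueError("observed and predicted values must both vary")
    cov = sum((o - mean_o) * (p - mean_p)
              for o, p in zip(observed, predicted)) / n

    r2 = cov ** 2 / (var_o * var_p)
    rmsep = math.sqrt(sum((o - p) ** 2
                          for o, p in zip(observed, predicted)) / n)
    ccc = 2 * cov / (var_o + var_p + (mean_o - mean_p) ** 2)

    # Residual analysis: regress residuals on mean-centered predictions.
    # The intercept is the mean bias; a non-zero slope indicates slope bias.
    residuals = [o - p for o, p in zip(observed, predicted)]
    mean_bias = fmean(residuals)
    slope_bias = sum((p - mean_p) * r
                     for p, r in zip(predicted, residuals)) / (n * var_p)

    return {"r2": r2, "rmsep": rmsep, "ccc": ccc,
            "mean_bias": mean_bias, "slope_bias": slope_bias}


if __name__ == "__main__":
    # Hypothetical observed and predicted values (g/d), for illustration only.
    obs = [410.0, 455.0, 502.0, 530.0, 601.0]
    pred = [420.0, 450.0, 495.0, 545.0, 590.0]
    for name, value in evaluate(obs, pred).items():
        print(f"{name}: {value:.3f}")
```

In a cross-validation setting, `predicted` would hold the out-of-fold predictions. Unlike R2, CCC penalizes a constant offset between observed and predicted values, which is one reason to report both.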