Statistical crop modeling is pivotal for understanding climate impacts on crop yields. The choice of model matters: linear regression is interpretable but limited in predictive power, whereas machine learning predicts well but often remains a black box. To develop explainable artificial intelligence (AI) for exploring historical crop yield data and predicting crop yield, we report a Bayesian ensemble model (BM) that is interpretable and has strong explanatory and predictive power. BM embraces many competitive models via Bayesian model averaging, fits complex functions, and quantifies model uncertainty. Long-term crop yields are driven by both climate and technology; the common practice of first detrending and then analyzing the detrended data introduces a bias that cannot be corrected afterward. Therefore, BM was also designed to decompose historical yield data so that technological trends and climate effects on crop yield are estimated jointly. We compared BM with ElasticNet, Neural Network, MARS, SVM, Random Forests, and XGBoost. BM excelled at both predicting and explaining. When tested on synthetic data, BM was the only method that recovered the true relationships, showing stronger interpretability; the other methods predicted well but for the wrong reasons. When tested on maize yield data in Ohio, BM detected two technological shifts, attributable to hybrid corn adoption in the 1940s and a technological slowdown in the 1970s; no other method detected these changepoints. BM derived nonlinear, asymmetric crop responses to climate and non-negligible temperature-precipitation interaction effects, with patterns consistent with theoretical or experimental evidence. Extrapolation of all the models for future yield prediction was highly uncertain, but BM provided more reliable predictions under novel climate, whereas Random Forests and XGBoost proved unsuitable for extrapolation. Overall, BM provided new insights unattainable with the existing black-box methods.
We caution against the blind use of black-box machine learning for statistical crop modeling and call for more effort to apply interpretable machine learning toward a mechanistic understanding of crop-climate interactions.
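As a rough illustration of the Bayesian model averaging idea mentioned above, the following minimal sketch averages polynomial regressions of increasing degree, weighting each by exp(-BIC/2) as an approximation to its posterior model probability. It is not the BM algorithm described in this abstract: the function name bma_polynomial, the BIC approximation, the polynomial candidate set, and the synthetic data are all assumptions made for illustration only.

```python
import numpy as np

def bma_polynomial(x, y, max_degree=3):
    """BIC-weighted average of polynomial fits.

    Hypothetical helper: a simple stand-in for Bayesian model averaging,
    not the BM method from the abstract.
    """
    n = len(y)
    bics, fits = [], []
    for degree in range(max_degree + 1):
        coeffs = np.polyfit(x, y, degree)
        fitted = np.polyval(coeffs, x)
        rss = float(np.sum((y - fitted) ** 2))
        k = degree + 2  # polynomial coefficients plus the noise variance
        bics.append(n * np.log(rss / n) + k * np.log(n))
        fits.append(fitted)
    bics = np.array(bics)
    # exp(-BIC/2) approximates each model's marginal likelihood;
    # subtracting the minimum BIC keeps the exponentials numerically stable.
    weights = np.exp(-0.5 * (bics - bics.min()))
    weights /= weights.sum()
    averaged = weights @ np.array(fits)  # model-averaged fitted values
    return weights, averaged

# Synthetic data (assumed for illustration): a quadratic signal plus noise.
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 60)
y = 1.0 + 2.0 * x - 3.0 * x**2 + rng.normal(scale=0.2, size=x.size)

weights, averaged = bma_polynomial(x, y)
print("approximate posterior model weights:", np.round(weights, 3))
```

In this toy setting the weights concentrate on the degrees that can represent the quadratic signal, and the averaged fit carries the remaining model uncertainty instead of committing to a single model.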