Abstract

Crop yield prediction before the harvest is crucial for food security, grain trade, and policy making. Previously, several machine learning methods have been applied to predict crop yield using different types of variables. In this study, we propose using the Geographically Weighted Random Forest Regression (GWRFR) approach to improve crop yield prediction at the county level in the US Corn Belt. We trained the GWRFR and five other popular machine learning algorithms (Multiple Linear Regression (MLR), Partial Least Square Regression (PLSR), Support Vector Regression (SVR), Decision Tree Regression (DTR), and Random Forest Regression (RFR)) with the following different sets of features: (1) full length features; (2) vegetation indices; (3) gross primary production (GPP); (4) climate data; and (5) soil data. We compared the results of the GWRFR with those of the other five models. The results show that the GWRFR with full length features (R2 = 0.90 and RMSE = 0.764 MT/ha) outperforms other machine learning algorithms. For individual categories of features such as GPP, vegetation indices, climate, and soil features, the GWRFR also outperforms other models. The Moran’s I value of the residuals generated by GWRFR is smaller than that of other models, which shows that GWRFR can better address the spatial non-stationarity issue. The proposed method in this article can also be potentially used to improve yield prediction for other types of crops in other regions.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call