Machine learning for large-scale crop yield forecasting

Dilli Paudel, Hendrik Boogaard, Allard De Wit, Sander Janssen, Sjoukje Osinga, Christos Pylianidis, Ioannis N Athanasiadis

doi:10.1016/j.agsy.2020.103016

Abstract

Many studies have applied machine learning to crop yield prediction with a focus on specific case studies. The data and methods they used may not be transferable to other crops and locations. On the other hand, operational large-scale systems, such as the European Commission's MARS Crop Yield Forecasting System (MCYFS), do not use machine learning. Machine learning is a promising method, especially when large amounts of data are being collected and published. We combined agronomic principles of crop modeling with machine learning to build a machine learning baseline for large-scale crop yield forecasting. The baseline is a workflow emphasizing correctness, modularity and reusability. For correctness, we focused on designing explainable predictors or features (in relation to crop growth and development) and applying machine learning without information leakage. We created features using crop simulation outputs and weather, remote sensing and soil data from the MCYFS database. We emphasized a modular and reusable workflow to support different crops and countries with small configuration changes. The workflow can be used to run repeatable experiments (e.g. early season or end of season predictions) using standard input data to obtain reproducible results. The results serve as a starting point for further optimizations. In our case studies, we predicted yield at regional level for five crops (soft wheat, spring barley, sunflower, sugar beet, potatoes) and three countries (the Netherlands (NL), Germany (DE), France (FR)). We compared the performance with a simple method with no prediction skill, which either predicted a linear yield trend or the average of the training set. We also aggregated the predictions to the national level and compared them with past MCYFS forecasts. The normalized RMSEs (NRMSEs) for early season predictions (30 days after planting) were comparable to those of MCYFS for NL (all crops), DE (all except soft wheat) and FR (soft wheat, spring barley, sunflower). For example, NRMSE was 7.87 for soft wheat (NL) (6.32 for MCYFS) and 8.21 for sugar beet (DE) (8.79 for MCYFS). In contrast, NRMSEs for soft wheat (DE), sugar beet (FR) and potatoes (FR) were twice as high as those of MCYFS. NRMSEs for end of season predictions were still comparable to MCYFS for NL, but worse for DE and FR. The baseline can be improved by adding new data sources, designing more predictive features and evaluating different machine learning algorithms. The baseline will motivate the use of machine learning in large-scale crop yield forecasting.
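The abstract describes a comparison method with no prediction skill (a linear yield trend or the training-set average) and reports errors as NRMSE. The following is a minimal sketch of those two ideas, not the authors' implementation. The function names are hypothetical, the yield values are made-up toy numbers, and the normalization (RMSE as a percentage of the mean observed yield) is an assumption about how NRMSE is defined. Both baselines are fitted only on training years, which mirrors the requirement to avoid information leakage from test years.

```python
from math import sqrt
from statistics import mean


def average_baseline(train_yields):
    """Predict the mean of the training yields for any target year."""
    return mean(train_yields)


def linear_trend_baseline(train_years, train_yields, target_year):
    """Fit yield = a + b * year by ordinary least squares, then extrapolate."""
    year_mean = mean(train_years)
    yield_mean = mean(train_yields)
    sxx = sum((x - year_mean) ** 2 for x in train_years)
    sxy = sum((x - year_mean) * (y - yield_mean)
              for x, y in zip(train_years, train_yields))
    slope = sxy / sxx
    return yield_mean + slope * (target_year - year_mean)


def nrmse(observed, predicted):
    """RMSE as a percentage of the mean observed value (assumed normalization)."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return 100.0 * sqrt(mean(squared_errors)) / mean(observed)


# Toy data (hypothetical, t/ha): train on 2001-2005, test on 2006-2007.
train_years = [2001, 2002, 2003, 2004, 2005]
train_yields = [6.0, 6.2, 6.4, 6.6, 6.8]
test_years = [2006, 2007]
test_yields = [7.1, 7.0]

# Baselines see only the training years, never the test years.
trend_predictions = [
    linear_trend_baseline(train_years, train_yields, year) for year in test_years
]
average_predictions = [average_baseline(train_yields) for _ in test_years]

print("trend NRMSE (%):", nrmse(test_yields, trend_predictions))
print("average NRMSE (%):", nrmse(test_yields, average_predictions))
```

A machine learning model is only useful in this setting if its NRMSE on the same test years is lower than that of both baselines.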

Highlights

Crop yield prediction is an important but complex problem, necessary for sustainable intensification and efficient use of natural resources (Phalan et al. 2014; Tilman et al. 2011).
We evaluated the two sets of predictions using mean absolute error (MAE), mean absolute percentage error (MAPE), root mean squared error (RMSE) and the coefficient of determination or R2.
MAPE results are included in Section D of Supplement 1, and all results including normalized MAE, normalized RMSE, MAPE and R2 for all case studies, experiments and algorithms are provided in Supplement 2.
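The four evaluation metrics named in the highlights can be written down compactly. This is a minimal sketch using their standard textbook definitions, assuming plain lists of observed and predicted yields. The function names and sample values are hypothetical and are not taken from the paper or its supplements.

```python
from math import sqrt


def mae(observed, predicted):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def mape(observed, predicted):
    """Mean absolute percentage error (observed values must be non-zero)."""
    total = sum(abs((o - p) / o) for o, p in zip(observed, predicted))
    return 100.0 * total / len(observed)


def rmse(observed, predicted):
    """Root mean squared error."""
    total = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return sqrt(total / len(observed))


def r2(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_observed = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical regional yields (t/ha) and predictions for illustration only.
observed = [2.0, 4.0, 6.0]
predicted = [3.0, 4.0, 5.0]

for name, metric in [("MAE", mae), ("MAPE", mape), ("RMSE", rmse), ("R2", r2)]:
    print(name, metric(observed, predicted))
```

MAE and RMSE are in yield units, so they are not directly comparable across crops with very different yield levels. MAPE and R2 are unitless, which is one reason normalized variants are often reported alongside them.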

Summary

Introduction

Crop yield prediction is an important but complex problem, necessary for sustainable intensification and efficient use of natural resources (Phalan et al. 2014; Tilman et al. 2011). Crop yield is influenced by many crop-specific parameters, environmental conditions and management decisions (Fischer 2015), and it is difficult to build a reliable and explainable prediction model. Crop growth models, remote sensing, statistical models and their combinations have been commonly used to predict crop yield. On their own, these methods address slightly different aspects of crop yield forecasting. Recent studies have combined different methods in innovative ways to build yield forecasting models. Lobell et al. (2015) and Zhao et al. (2020) used high-resolution remote sensing data and crop modeling to build statistical models to forecast the actual yield. Newlands et al. (2014) developed a probabilistic yield forecasting framework for Canada using remote sensing, crop modeling, Bayesian inference and statistical models.

Journal: Agricultural Systems
Publication Date: Dec 14, 2020
Citations: 132
License type: CC BY
