Abstract

We built machine learning and image analysis tools to forecast winter wheat yield from a rich multidimensional tensor of agricultural information spanning different scales. This information consists of multi-band satellite images, local soil samples obtained from national databases, local weather data, and field data from 23 farms cultivating winter wheat in southern Sweden. This is inherently a large multi-scale problem because of the large temporal and spatial variation of the input data. We aggregate the data into spatially averaged features over grids, which temporally span a seasonal timeline from seeding to harvest. Data cleaning is performed through interpolation for satellite images affected by cloud obstructions. Furthermore, the data is heavily imbalanced, since the amount of satellite information far exceeds that of the ground data. Data variance can therefore be an issue, which we counter by using a decision tree approach. We find that a Light Gradient Boosting decision tree trained on 262 input features is able to predict winter wheat yield with 82% accuracy. Subsequently, we employ game theory to better understand the relative importance of specific input features for forecasting yield. Specifically, we find that some of the most important features for the resulting predictions are the percentages of clay and magnesium in the soil. Similarly, the most important features from the satellite data are: a) the NORM index (Euclidean distance of all bands) computed in the second week of April, b) the NORM index computed in the middle of May, and c) the second spectral band from the last week of June.
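
As an illustration of two of the preprocessing steps named above, the following is a minimal sketch rather than the authors' implementation. It assumes that the NORM index is the per-pixel Euclidean norm of the band values and that cloud gaps are marked as NaN and filled by linear interpolation in time. The function names `norm_index` and `fill_cloud_gaps`, the array layout, and the sample values are all hypothetical.

```python
import numpy as np


def norm_index(bands: np.ndarray) -> np.ndarray:
    """Per-pixel Euclidean norm over the band axis.

    Assumes `bands` has shape (n_bands, height, width); this layout
    is an assumption, not taken from the paper.
    """
    return np.sqrt(np.sum(bands.astype(float) ** 2, axis=0))


def fill_cloud_gaps(series: np.ndarray) -> np.ndarray:
    """Linearly interpolate NaN entries in a 1-D time series.

    Assumes cloud-obstructed observations are marked as NaN and that
    observations are evenly spaced in time (e.g. weekly).
    """
    series = series.astype(float)
    t = np.arange(series.size)
    valid = ~np.isnan(series)
    if not valid.any():
        # Nothing to interpolate from; return the series unchanged.
        return series
    return np.interp(t, t[valid], series[valid])


# Hypothetical data: two bands over a single pixel, and a three-week
# feature series whose middle observation is hidden by clouds.
bands = np.array([[[3.0]], [[4.0]]])
weekly = np.array([0.2, np.nan, 0.4])

pixel_norm = norm_index(bands)
filled = fill_cloud_gaps(weekly)
```

Spatially averaging `pixel_norm` over each grid cell would then give one feature per cell and week, in the spirit of the aggregation the abstract describes.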

