Understanding the relationships among crop yield, soil properties and topographic characteristics is critical for agricultural management. The objective of this study was to compare techniques for quantifying how soil and topographic characteristics relate to crop yield, using high-resolution data. The study used a multi-field dataset from southwestern Ontario, Canada, a region where few studies have applied precision agriculture and machine learning (ML) to the soil property-yield relationship. The dataset included 145,500 observations of corn and soybean yield, topographic characteristics and soil nutrient characteristics. The attributes considered were pH, soil organic matter (OM) content, cation exchange capacity (CEC), soil test phosphorus, zinc (Zn), potassium (K), elevation and topographic wetness index. Multiple linear regression (MLR), artificial neural networks, decision trees and random forests were compared to identify methods able to relate soil properties to crop yields at a subfield scale (2 m). Random forests were the most successful at predicting yield, with an R² of 0.85 for corn and 0.94 for soybeans. MLR was the least successful, with an R² of 0.40 for corn and 0.45 for soybeans. Cross-validation experiments showed that random forest models could, in most but not all cases, predict low- and high-yield areas in fields excluded from the training datasets. The techniques used to test the models also identified soil and topographic attributes that were significant when predicting yield, although this identification was subject to some uncertainty. These results suggest that ML techniques might be used to predict high-yield areas in fields without existing yield maps, provided those fields have similar relationships between soil properties and yield.
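The cross-validation idea described above, holding out a whole field and scoring predictions with R², can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy records are invented, the function names are hypothetical, and a one-predictor least-squares line stands in for the random forest models the study actually used.

```python
from statistics import mean


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    obs_mean = mean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - obs_mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def fit_line(x, y):
    """Ordinary least squares with one predictor; returns (slope, intercept)."""
    x_mean, y_mean = mean(x), mean(y)
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


def leave_one_field_out(records):
    """Train on all fields but one, score on the held-out field.

    records: list of (field_id, predictor, yield) tuples.
    Returns a dict mapping each held-out field_id to its R².
    """
    scores = {}
    for held_out in sorted({r[0] for r in records}):
        train = [r for r in records if r[0] != held_out]
        test = [r for r in records if r[0] == held_out]
        slope, intercept = fit_line([r[1] for r in train], [r[2] for r in train])
        predicted = [slope * r[1] + intercept for r in test]
        scores[held_out] = r_squared([r[2] for r in test], predicted)
    return scores


# Invented toy data (not from the study): fields A and B share one
# predictor-yield relationship, while field C has the opposite trend.
records = [
    ("A", 2.0, 5.0), ("A", 3.0, 7.0), ("A", 4.0, 9.0),
    ("B", 2.5, 6.0), ("B", 3.5, 8.0), ("B", 4.5, 10.0),
    ("C", 2.0, 9.0), ("C", 3.0, 7.0), ("C", 4.0, 5.0),
]

scores = leave_one_field_out(records)
for field_id, score in scores.items():
    print(f"held-out field {field_id}: R² = {score:.2f}")
```

In this toy setup, field C receives a negative R² when held out, because the model trained on A and B learned a relationship that does not hold in C. This mirrors the caveat that predictions transfer only to fields whose soil-yield relationships resemble those in the training data.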