In South Asia, wheat is typically grown in favorable environments, although policies promoting intensification in Bangladesh's stress-prone coastal zone have resulted in expanded cultivation in this non-traditional area. Relatively little is known about how best to manage wheat in these unique environments. Research is thus needed to identify ‘best-bet’ entry points to optimize productivity, but classical parametric analyses are of limited use in elucidating the relative importance of the multiple factors and interactions that influence yield under such conditions. This problem is most evident in datasets derived from farmer-participatory research, where missing values and skewed data are common. This paper examines the predictive power of three approaches: parametric linear mixed effects models (LMMs) and two non-parametric binary recursive partitioning methods, classification and regression trees (CARTs) and Random Forests. We collected yield, crop management, and environmental observations from 422 wheat fields in the 2012–13 season, across six production environments spanning southern Bangladesh, where nutrient rates and genotypes were imposed, but management of other production factors varied from farmer to farmer. Fields were grouped into early- and late-sowing categories, depending on whether the crop was established before or after December 15, and into a combined category spanning both groups. For each of these groups, we investigated which factors each analysis identified as influencing yield. All three approaches identified nitrogen rate and environment as the most important factors, regardless of sowing category. CART also identified assemblages of high- and low-yielding environments, although those located in saline and warmer thermal zones were not necessarily the lowest yielding, indicating that farmers can optimize crop management to overcome these constraints.
The number of days farmers sowed wheat before or after December 15, days to maturity, and the number of irrigations and weedings also influenced yield, though each method weighted these factors differently. LMMs also indicated a slight yield advantage when farmers used stress-tolerant genotypes, though CART and Random Forests did not. One-to-one plots of observed vs. predicted yields showed that LMMs outperformed Random Forests, with smaller root mean square error and mean absolute error for the combined, early-, and late-sowing groups. While the LMMs were superior in this case, Random Forests may still prove useful in the classification and interpretation of farm survey data in which no treatment interventions have been administered.
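The two error measures used to compare observed and predicted yields can be illustrated with a minimal sketch. It shows only the standard definitions of root mean square error and mean absolute error; the function names and the yield values below are hypothetical and are not taken from the study's data or code.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between paired observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(observed, predicted):
    """Mean absolute error between paired observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    absolute = [abs(o - p) for o, p in zip(observed, predicted)]
    return sum(absolute) / len(absolute)


# Hypothetical wheat yields (t/ha), for illustration only.
observed = [3.0, 2.5, 4.0, 3.5]
predicted = [2.5, 2.5, 4.5, 3.0]

print(f"RMSE: {rmse(observed, predicted):.3f} t/ha")
print(f"MAE:  {mae(observed, predicted):.3f} t/ha")
```

Because squaring weights large deviations more heavily, RMSE is never smaller than MAE on the same data, so reporting both gives some sense of whether a model's errors are dominated by a few poorly predicted fields.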