Scientists have estimated that global crop production needs to double by 2050 to meet the demand for food, feed, and fuel. To reach this goal, novel methods are needed both to accelerate breeding gains in potential yield and to raise on-farm yields through improved management strategies. Both tasks require the ability to predict plant performance in multiple, dynamic environments from a knowledge of cultivar characteristics (critical short day lengths, maximum leaf photosynthetic rates, pod fill durations, etc.) that are ultimately linked to genetics. Because of this linkage, we refer to such traits as genotype-specific parameters (GSPs). Using industry-provided yield and weather data from 353 site-years, we estimated seven primary CROPGRO-Soybean GSPs for each of 182 varieties. The data set had two shortcomings. First, no planting dates were supplied, so the environment actually experienced by the crop could not be determined directly. Second, soil data were provided only for the top 20 cm, which is inadequate to specify the root environment and the available water supply. Therefore, additional edaphic information was acquired. A novel optimization algorithm was developed that simultaneously estimates GSPs and planting dates while tuning layered soil water-holding properties. The optimizer, which we have named the holographic genetic algorithm (HGA), uses both externally supplied constraints and its own analysis of data structure to reduce what would otherwise be a search over 2000 dimensions to a much smaller number of overlapping 1- to 3-D problems. Two types of runs were performed. The first was preceded by an independent component analysis (ICA) of published GSPs, and the subsequent training sought good component scores rather than the GSPs themselves. The second, the separate factor (SF) approach, allowed all GSPs to vary separately, leaving the parameters unconstrained and more evenly distributed.
Results showed that the HGA works well with the CROPGRO-Soybean model for estimating cultivar- and site-specific parameters from breeding trial data. The quality of the calibrations and evaluations was similar across both run types, with RMSE values of ca. 5.6% of the maximum yields. Moreover, the GSPs for a variety can be used to predict its yield in trials not used in that cultivar's calibration. Finally, despite the high dimensionality, the GSPs, planting dates, and soil properties for all lines and sites converged concurrently in <58 iterations, which suggests the method is well suited to large data sets.
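
To make the reported error metric concrete, the following is a minimal sketch of how an RMSE expressed as a percentage of the maximum yield could be computed. The function name `rmse_percent_of_max` is hypothetical, and normalizing by the largest observed yield is an assumption, since the abstract does not state exactly how the normalization was done.

```python
import math


def rmse_percent_of_max(observed, predicted):
    """Return the RMSE between observed and predicted yields,
    expressed as a percentage of the maximum observed yield.

    Assumption: the normalizer is max(observed); the abstract does not
    specify which maximum was used.
    """
    if not observed or len(observed) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean of squared prediction errors over all trials
    mean_sq_err = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    rmse = math.sqrt(mean_sq_err)
    return 100.0 * rmse / max(observed)


# Illustrative, made-up yields (not data from the study)
example = rmse_percent_of_max([100.0, 200.0], [110.0, 190.0])
print(example)
```

With these made-up numbers both errors are 10 yield units, so the RMSE is 10 and the normalized value is 5% of the maximum observed yield of 200.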