Making plant breeding programs less expensive, faster, more practical, and more accurate, especially for soybean, promotes the selection of new genotypes and contributes to the emergence of varieties that absorb and metabolize nutrients more efficiently. Combining spectral information from soybean genotypes with nutritional information on secondary macronutrients can help genetic improvement programs select populations that are efficient in absorbing and metabolizing these nutrients. In addition, processing this information with machine learning (ML) algorithms makes the identification of superior genotypes more accurate. Therefore, the objective of this work was to evaluate how well ML algorithms, given different inputs, classify soybean genotypes according to their secondary macronutrient content. The experiment was conducted in the experimental area of the Federal University of Mato Grosso do Sul, in the municipality of Chapadão do Sul, Brazil. A total of 103 F2 soybean populations were sown in the 2019/20 crop season in a randomized block design with two replications. At 60 days after crop emergence (DAE), spectral images were collected with a senseFly eBee RTK fixed-wing remotely piloted aircraft (RPA) with autonomous control of takeoff, flight plan, and landing. At the reproductive stage (R1), three leaves were collected per plant to determine the levels of the macronutrients calcium (Ca), magnesium (Mg), and sulfur (S). The spectral data and the Ca, Mg, and S levels of the genotypes were subjected to Pearson correlation analysis; a principal component (PC) analysis was then carried out together with a k-means algorithm to divide the genotypes into clusters. The clusters were taken as output variables, while the spectral data were used as input variables for the classification models in the machine learning analyses.
The input configurations tested in the models were spectral bands (SBs), vegetation indices (VIs), and a combination of both. Combining machine learning algorithms with spectral data can provide important biological information about soybean plants. Classifying soybean genotypes according to calcium, magnesium, and sulfur content can save time, effort, and labor in field evaluations in genetic improvement programs. Therefore, using spectral bands as input data for the random forest algorithm makes the classification of soybean genotypes by secondary macronutrient content efficient, which is valuable for researchers in the field.
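The workflow summarized above (Pearson correlation, PC analysis with k-means clustering, then classification from SBs, VIs, or both) might be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the authors' implementation. The data are synthetic placeholders, and the band names, the two vegetation indices, the number of clusters, and the cross-validation scheme are all assumptions that the abstract does not specify.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n_plots = 206  # 103 populations x 2 replications

# Synthetic placeholders, NOT the study's data.
nutrients = rng.normal(size=(n_plots, 3))          # columns: Ca, Mg, S
bands = rng.uniform(0.05, 0.6, size=(n_plots, 4))  # assumed: green, red, red edge, NIR

# Assumed vegetation indices (NDVI and green NDVI); the abstract does not list the VIs used.
green, red, nir = bands[:, 0], bands[:, 1], bands[:, 3]
vis = np.column_stack([(nir - red) / (nir + red), (nir - green) / (nir + green)])

# Pearson correlation between spectral bands and nutrient levels.
corr = np.corrcoef(np.hstack([bands, nutrients]), rowvar=False)

# PC analysis on standardized nutrient levels, then k-means on the PC scores.
scores = PCA(n_components=2).fit_transform(StandardScaler().fit_transform(nutrients))
clusters = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(scores)  # k=3 is assumed

# Clusters are the output classes; each input configuration is evaluated separately.
inputs = {"SB": bands, "VI": vis, "SB+VI": np.hstack([bands, vis])}
results = {}
for name, X in inputs.items():
    model = RandomForestClassifier(n_estimators=100, random_state=0)
    results[name] = cross_val_score(model, X, clusters, cv=5).mean()  # mean accuracy

for name, accuracy in results.items():
    print(f"{name}: mean cross-validated accuracy = {accuracy:.2f}")
```

Because the placeholder inputs are random, the accuracies printed here carry no biological meaning; with real band reflectances and leaf nutrient levels, the same loop would allow the three input configurations to be compared directly.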