Abstract

Background: Identifying the factors that significantly influence a response is crucial for trait-based breeding programs, because it supports better decisions about improving productivity. Multiple linear regression (MLR) is the benchmark method commonly used to identify suitable factors for crop improvement. It does not always work, however, because of the stringent assumptions behind the MLR model (linearity, absence of multicollinearity). Here we tried to develop an efficient model for selecting the major traits that contribute to seed yield in soybean by comparing different models.

Methods: A field experiment was conducted on a 98-member soybean core population using an augmented design. Eighteen morphometric traits obtained from the soybean core population were considered as regressors. Multiple linear regression (MLR), principal component regression (PCR), regression tree, and random forest models were compared to select traits on the basis of prediction accuracy.

Result: All the models identified the number of pods per plant (NPP) as the most influential variable for soybean yield. However, random forest had much higher predictive power (RMSE = 4.59, MAPE = 0.18) than the other models studied. The random forest results revealed the number of pods per plant, the number of branches per plant, and other associated characters such as plant height at harvest as highly influential traits for seed yield in soybean. Finally, we tried to identify genotypes that are superior in the morphological characters most influential on seed yield using cluster analysis.
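
The two accuracy measures quoted above, RMSE and MAPE, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names `rmse` and `mape` are hypothetical, and MAPE is taken here as a fraction rather than a percentage, which is only a guess based on the reported value of 0.18. The yield values in the usage lines are made up for illustration and are not data from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(observed))


def mape(observed, predicted):
    """Mean absolute percentage error, returned as a fraction (assumption).

    Undefined when any observed value is zero, so that case raises.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(o == 0 for o in observed):
        raise ValueError("MAPE is undefined when an observed value is zero")
    relative_errors = [abs((o - p) / o) for o, p in zip(observed, predicted)]
    return sum(relative_errors) / len(observed)


# Illustrative (made-up) seed yields and model predictions.
observed_yield = [10.0, 20.0, 30.0]
predicted_yield = [12.0, 18.0, 33.0]
print("RMSE:", rmse(observed_yield, predicted_yield))
print("MAPE:", mape(observed_yield, predicted_yield))
```

Lower values of both measures indicate better predictions. RMSE is expressed in the units of the response, while MAPE is scale-free, which is why the two are often reported together when comparing models.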
