The Hyrcanian forest is a global biodiversity hotspot that harbors many endemic and endangered tree species, but its tree diversity is threatened by human-induced disturbances such as logging, grazing, and urbanization. To address this issue, we used three machine learning methods, namely linear regression (LR), random forest (RF), and support vector machine (SVM), to assess and predict tree species diversity within the forest. We collected an extensive dataset of forest structure and environmental factors from 2725 sample plots located throughout the forest, and quantified the tree species diversity of each plot with the Shannon-Wiener diversity index. Basal area, tree density, and tree height were the most important predictors of tree diversity, followed by diameter at breast height, elevation, slope, and aspect. We measured model performance using the coefficient of determination (R²), root mean square error (RMSE), and percent of relative error index (PREI). RF was the best-performing model in both the training (RMSE = 0.143, R² = 0.94, PREI = -0.09) and validation (RMSE = 0.15, R² = 0.94, PREI = -0.09) phases, and it generalized to new data without losing much accuracy or explanatory power. SVM showed moderate performance (training: RMSE = 0.23, R² = 0.57, PREI = -0.17; validation: RMSE = 0.36, R² = 0.34, PREI = -0.21), while LR performed the worst (training: RMSE = 0.41, R² = 0.13, PREI = -0.19; validation: RMSE = 0.41, R² = 0.11, PREI = -0.36). These findings have broad applications beyond this specific region and can contribute to promoting sustainable land use practices and conservation efforts in other ecosystems facing similar challenges.
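As an illustration of the quantities named above, the following minimal sketch computes a Shannon-Wiener index for one plot, along with RMSE and the coefficient of determination for a set of predictions. It is not the study's code. The function names, the use of the natural logarithm, and the definition of the coefficient of determination as 1 - SS_res / SS_tot are assumptions made for this example, and PREI is omitted because its formula is not given here.

```python
import math
from collections import Counter


def shannon_wiener(species):
    """Shannon-Wiener index H' = -sum(p_i * ln(p_i)) for one plot.

    `species` is a list with one species label per recorded tree.
    The natural logarithm is an assumption; other bases only rescale H'.
    """
    counts = Counter(species)
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total) for n in counts.values())


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(observed, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical plot with two equally abundant species.
    plot = ["species_a", "species_a", "species_b", "species_b"]
    print(shannon_wiener(plot))

    # Hypothetical observed and predicted diversity values.
    observed = [0.5, 1.0, 1.5]
    predicted = [0.6, 0.9, 1.4]
    print(rmse(observed, predicted), r_squared(observed, predicted))
```

A plot with a single species yields an index of zero, and the index grows as trees are spread more evenly across more species. This is why per-plot values can serve as a regression target for models such as LR, RF, and SVM.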