Abstract

Soil seed banks are pivotal for understanding plant population life histories and elucidating plant-environment relationships. Although experimental study of persistent seed banks yields valuable insights, it is often time-consuming, labor-intensive, and costly. Efficient computational methods such as machine learning are therefore needed to complement traditional fieldwork and improve the analysis and interpretation of seed bank data. In this study, we collated a dataset of 398 observations from the TRY database, comprising seed thickness, seed length, seed width, seed dry mass, and seed bank type. Notably, for some species the dataset records only that a soil seed bank is present, without distinguishing between persistent and transient seed banks. Our primary objective was to use these data to predict whether a species' soil seed bank is persistent or transient. To this end, we evaluated four classic machine learning methods, namely Adaptive Boosting (AdaBoost), Artificial Neural Network (ANN), Random Forest (RF), and Support Vector Machine (SVM), for predicting seed bank type. We assessed each model using five metrics: accuracy, AUC (area under the curve), kappa, sensitivity, and specificity. Of the four models, the ANN was the most effective at predicting soil seed bank persistence, suggesting that seed bank type can be predicted from a small number of independent variables. This study underscores the potential of machine learning methods in soil seed bank research and paves the way for future investigations. By improving our understanding of plant population dynamics, it may also aid in crafting conservation strategies.
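
For readers unfamiliar with the five evaluation metrics named above, the following is a minimal sketch of how they can be computed for a binary persistent/transient classifier. It is not the authors' code: the function name binary_metrics, the coding of persistent as 1 and transient as 0, the 0.5 decision threshold, and the toy scores are all assumptions made for illustration. Kappa is taken here to mean Cohen's kappa, and AUC is computed as the fraction of (persistent, transient) pairs in which the persistent species receives the higher score, with ties counted as one half.

```python
from typing import Dict, Sequence


def binary_metrics(
    y_true: Sequence[int],
    y_score: Sequence[float],
    threshold: float = 0.5,  # assumed cut-off; not stated in the abstract
) -> Dict[str, float]:
    """Accuracy, AUC, Cohen's kappa, sensitivity and specificity.

    Assumed coding: 1 = persistent seed bank, 0 = transient seed bank.
    """
    if len(y_true) != len(y_score):
        raise ValueError("y_true and y_score must have the same length")

    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")

    # Confusion-matrix counts at the chosen threshold.
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    n = tp + tn + fp + fn

    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn)  # true-positive rate
    specificity = tn / (tn + fp)  # true-negative rate

    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = (accuracy - p_chance) / (1 - p_chance) if p_chance < 1 else 0.0

    # AUC as the share of positive/negative pairs ranked correctly.
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos
        for q in neg
    )
    auc = wins / (len(pos) * len(neg))

    return {
        "accuracy": accuracy,
        "auc": auc,
        "kappa": kappa,
        "sensitivity": sensitivity,
        "specificity": specificity,
    }


if __name__ == "__main__":
    # Toy labels and scores, invented purely for illustration.
    labels = [1, 1, 1, 0, 0, 0]
    scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
    for name, value in binary_metrics(labels, scores).items():
        print(f"{name}: {value:.3f}")
```

Accuracy, kappa, sensitivity, and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is one reason the metrics are usually reported together.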
