Why choose Random Forest to predict rare species distribution with few samples in large undersampled areas? Three Asian crane species models provide supporting evidence.

Chunrong Mi, Lijia Wen, Xuesong Han, Falk Huettmann, Yumin Guo

doi:10.7717/peerj.2849


Open Access


Journal: PeerJ
Publication Date: Jan 12, 2017
Citations: 195
License type: CC BY 4.0

Affiliation: Beijing Forestry University, University of Alaska Fairbanks

Abstract

Species distribution models (SDMs) have become an essential tool in ecology, biogeography, evolution and, more recently, conservation biology. How to generalize species distributions across large undersampled areas, especially from few samples, is a fundamental issue for SDMs. To explore this issue, we used the best available presence records for the Hooded Crane (Grus monacha, n = 33), White-naped Crane (Grus vipio, n = 40), and Black-necked Crane (Grus nigricollis, n = 75) in China as three case studies, and applied four powerful and commonly used machine learning algorithms to map the breeding distributions of the three species: TreeNet (Stochastic Gradient Boosting, Boosted Regression Tree Model), Random Forest, CART (Classification and Regression Tree) and Maxent (Maximum Entropy Models). In addition, we developed an ensemble forecast by averaging the predicted probabilities of these four models. Commonly used model performance metrics, the area under the ROC curve (AUC) and the true skill statistic (TSS), were employed to evaluate model accuracy. The latest satellite tracking data and compiled literature data were used as two independent testing datasets to confront model predictions. We found that Random Forest demonstrated the best performance under most assessment methods, provided a better model fit to the testing data, and achieved better species range maps for each crane species in undersampled areas. Random Forest has been generally available for more than 20 years and is known to perform extremely well in ecological predictions. However, although its use is on the rise, its potential is still widely underused in conservation, (spatial) ecological applications and inference. Our results show that it informs ecological and biogeographical theories and is suitable for conservation applications, specifically when the study area is undersampled. This method helps to save model-selection time and effort, and allows robust and rapid assessments and decisions for efficient conservation.
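The ensemble forecast and the AUC evaluation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' workflow: the probabilities, the five test sites and the helper names `ensemble_mean` and `auc` are hypothetical, and the averaging is assumed to be unweighted.

```python
def ensemble_mean(predictions):
    """Unweighted mean of per-site probabilities across models.

    `predictions` is a list of equal-length lists, one list per model.
    """
    n_models = len(predictions)
    return [sum(site) / n_models for site in zip(*predictions)]


def auc(labels, scores):
    """Rank-based AUC: the probability that a randomly chosen presence site
    scores higher than a randomly chosen absence site (ties count one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one presence and one absence")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical predicted probabilities at five test sites, one list per model.
model_predictions = {
    "random_forest": [0.9, 0.8, 0.3, 0.2, 0.1],
    "maxent": [0.7, 0.4, 0.5, 0.3, 0.2],
    "cart": [1.0, 0.0, 0.0, 1.0, 0.0],
    "treenet": [0.6, 0.6, 0.4, 0.4, 0.2],
}
# Hypothetical observations: 1 = crane present, 0 = absent.
observed = [1, 1, 0, 0, 0]

ensemble = ensemble_mean(list(model_predictions.values()))
ensemble_auc = auc(observed, ensemble)

for name, scores in model_predictions.items():
    print(name, round(auc(observed, scores), 3))
print("ensemble", round(ensemble_auc, 3))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of presence from absence sites, which is the scale on which the model comparisons in this paper are reported.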

Highlights

Species distribution models (SDMs) are empirical ecological models that relate species observations to environmental predictors (Guisan & Zimmermann, 2000; Drew, Wiersma & Huettmann, 2011).
Across the four SDM techniques, the area under the ROC curve (AUC) values for Random Forest were always the highest (>0.625), ranking this model first, followed by Maxent (>0.558) and then either CART or TreeNet (≥0.500).
The true skill statistic (TSS) gave results consistent with AUC: Random Forest performed best (>0.250), followed by Maxent (>0.137), for all three crane species; CART took third place for Black-necked Cranes, and TreeNet performed better than CART for White-naped Cranes.
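For readers unfamiliar with TSS, it is conventionally computed as sensitivity plus specificity minus one, after converting predicted probabilities to presence/absence at a threshold. The sketch below assumes a threshold of 0.5 and uses hypothetical data; the threshold the authors applied is not stated in this excerpt, and the function name is illustrative.

```python
def true_skill_statistic(labels, scores, threshold=0.5):
    """TSS = sensitivity + specificity - 1.

    The 0.5 default threshold is an assumption made for illustration only.
    """
    tp = fn = tn = fp = 0
    for y, s in zip(labels, scores):
        predicted_present = s >= threshold
        if y == 1 and predicted_present:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted_present:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one presence and one absence")
    sensitivity = tp / (tp + fn)  # share of presences predicted as present
    specificity = tn / (tn + fp)  # share of absences predicted as absent
    return sensitivity + specificity - 1.0


# Hypothetical test data: 1 = present, 0 = absent.
observed = [1, 1, 1, 0, 0, 0, 0]
predicted = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]

tss = true_skill_statistic(observed, predicted)
print(round(tss, 3))
```

TSS ranges from -1 to 1, with 0 indicating performance no better than random, so the reported values above 0.250 for Random Forest indicate skill above chance.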

Summary

Introduction

Species distribution models (SDMs) are empirical ecological models that relate species observations to environmental predictors (Guisan & Zimmermann, 2000; Drew, Wiersma & Huettmann, 2011). Generalizing and inferring from a model, or model transferability, is defined as the geographical or temporal cross-applicability of models (Thomas & Bovee, 1993; Kleyer, 2002; Randin et al., 2006). It is an important feature of SDMs and a base requirement in several ecological and conservation biological applications (Heikkinen, Marmion & Luoto, 2012). Detailed distribution data for rare species in large areas are rarely available for SDMs (Pearson et al., 2007; Booms, Huettmann & Schempf, 2010), yet such data are among the most needed for conservation to be effective. Collecting and assembling distribution data, especially for rare or endangered species in remote wilderness areas, is often a very difficult task that requires substantial human, time and funding resources (Gwena et al., 2010; Ohse et al., 2009).

