Abstract

Faunal analyses depend on accurate taxonomic identifications, but morphologically similar taxa cannot always be distinguished by visual comparison with reference skeletons. Previous research has addressed this limitation through morphometric modeling using Linear Discriminant Function Analysis (LDA) or Principal Component Analysis (PCA). However, both approaches are limited by their assumptions and by how well they can estimate error, which constrains their empirical use for identifying faunal specimens. Random Forest (RF), a machine learning method, can resolve these limitations. Here, we evaluate the predictive power of LDA, PCA, and RF for taxonomic identification through morphometric modeling to determine which approach is best suited to faunal analyses. We use cranial specimens of modern Dipodomys spp. (kangaroo rat) and Leporidae (rabbit and hare) species to simulate complete datasets and datasets with missing measurement variables. With these datasets, we estimate species identification error rates and assess how well each statistical approach establishes species-level identifications under different conditions. Results indicate that RF outperforms LDA and PCA: it predicts species identity more accurately both with a complete dataset and when missing measurement data are interpolated. Next, using faunal material from Abrigo de los Escorpiones, a trans-Holocene site in Baja California, we demonstrate the use of RF for species identification and show that LDA, PCA, and RF all produce significantly different species identifications of the same material, emphasizing the need to validate statistical models used for taxonomic identification. Ultimately, this study highlights RF's predictive power and utility for faunal analysis, making it an important tool for zooarchaeological and paleoecological research.
