Abstract

Species identification, which is important for most biological disciplines, is not always straightforward because cryptic species hamper traditional identification. Fibre-optic near-infrared spectroscopy (NIRS) is a rapid and inexpensive method used in various applications, including the identification of species. Despite its efficiency, NIRS has never been tested on a group of more than two cryptic species, and a working routine is still missing. Hence, we tested whether the four morphologically highly similar but genetically distinct ant species Tetramorium alpestre, T. caespitum, T. impurum, and T. sp. B, all four co-occurring at elevations above 1,300 m above sea level in the Alps, can be identified unambiguously using NIRS. Furthermore, we evaluated which of our implementations of three analysis approaches, partial least squares regression (PLS), artificial neural networks (ANN), and random forests (RF), is most efficient in species identification with our data set. We opted for 100% classification certainty, i.e., a residual risk of misidentification of zero within the available data, at the cost of excluding specimens from identification. Additionally, because PLS requires a multi-class system to be reduced to a two-class system, we examined which of our implementations of two reduction strategies worked best with our data: one-vs-all, i.e., one species compared with the pooled set of the remaining species, or binary-decision strategies. Our NIRS identification routine, based on 100% identification certainty, unambiguously identified up to 66.7% of the specimens of a species. In detail, PLS scored best over all species (36.7% of specimens), while RF was much less effective (10.0%) and ANN failed completely (0.0%) with our data and our implementations of the analyses. Moreover, we showed that the one-vs-all strategy is the only acceptable option for reducing multi-class systems because it requires the least time.
We emphasise our classification routine using fibre-optic NIRS in combination with PLS and the one-vs-all strategy as a highly efficient pre-screening identification method for cryptic ant species and possibly beyond.
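
The one-vs-all reduction and the idea of trading coverage for certainty can be sketched in a few lines of Python. This is a minimal illustration only, not the authors' implementation: the function names, the score values, and the rejection rule (accept a specimen only if exactly one species model scores above the highest score of any pooled non-target specimen) are assumptions made for this example. In practice the scores would come from a fitted model such as PLS.

```python
def one_vs_all_labels(labels, target):
    """Recode multi-class labels as 1 (target species) or 0 (all other species pooled)."""
    return [1 if label == target else 0 for label in labels]


def certainty_threshold(scores, binary_labels):
    """Hypothetical cutoff: the highest score reached by any pooled non-target specimen.

    Accepting only scores strictly above this value gives zero
    misidentifications within the data used to set it.
    """
    return max(s for s, y in zip(scores, binary_labels) if y == 0)


def identify(specimen_scores, thresholds):
    """Return a species only if exactly one model exceeds its threshold, else None.

    None means the specimen is excluded from identification.
    """
    hits = [sp for sp, s in specimen_scores.items() if s > thresholds[sp]]
    return hits[0] if len(hits) == 1 else None


# Illustrative, made-up labels and scores for one species model.
labels = ["T. alpestre", "T. caespitum", "T. alpestre", "T. impurum"]
binary = one_vs_all_labels(labels, "T. alpestre")
cutoff = certainty_threshold([0.9, 0.4, 0.7, 0.2], binary)
```

Specimens for which `identify` returns `None` are the ones excluded "at the cost" of certainty; a stricter threshold lowers the share of identified specimens but keeps the misidentification risk within the available data at zero.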

Highlights

  • Correct species identification is crucial for most fields of biology, including biodiversity research, conservation biology, invasion biology, and the understanding of evolution (Bickford et al., 2007; Pfenninger & Schwenk, 2007)

  • The principal component analysis (PCA) plot showed no distinct clustering of the spectral data according to species (Fig. 5)

  • For the four-class system, as used in this study, the estimated elaboration times were 5.3 h for one-vs-all, 29.3 h for binary-decision type A, and 12 h for binary-decision type B. These differences increased with increasing number of classes, e.g., for a seven-class system as represented by all Central European species of the Tetramorium caespitum/impurum complex, one-vs-all would take 9.3 h, binary-decision type A 560.0 h, and binary-decision type B 354.7 h
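
The reported elaboration times can be compared with a short calculation. The sketch below uses only the figures quoted above; the dictionary layout and function names are assumptions for illustration. It computes the one-vs-all time per class and how many times slower each binary-decision strategy is than one-vs-all.

```python
# Estimated elaboration times in hours, as quoted in the text above.
reported_hours = {
    4: {"one-vs-all": 5.3, "binary-decision A": 29.3, "binary-decision B": 12.0},
    7: {"one-vs-all": 9.3, "binary-decision A": 560.0, "binary-decision B": 354.7},
}


def hours_per_class(n_classes):
    """One-vs-all time divided by the number of classes."""
    return reported_hours[n_classes]["one-vs-all"] / n_classes


def slowdown(n_classes, strategy):
    """How many times longer a strategy takes than one-vs-all for the same class count."""
    times = reported_hours[n_classes]
    return times[strategy] / times["one-vs-all"]


for n in sorted(reported_hours):
    print(f"{n} classes: {hours_per_class(n):.2f} h per class (one-vs-all)")
    for strategy in ("binary-decision A", "binary-decision B"):
        print(f"  {strategy}: {slowdown(n, strategy):.1f}x one-vs-all")
```

The one-vs-all figures work out to roughly 1.3 h per class for both the four-class and the seven-class system, which is consistent with its effort growing linearly with the number of classes, whereas the gap to the binary-decision strategies widens sharply between four and seven classes.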


Summary

Introduction

Correct species identification is crucial for most fields of biology, including biodiversity research, conservation biology, invasion biology, and the understanding of evolution (Bickford et al., 2007; Pfenninger & Schwenk, 2007). Cryptic species are known from all biogeographical regions and from all major metazoan taxa (Pfenninger & Schwenk, 2007). Complexes of cryptic species, i.e., groups of more than two species that cannot be differentiated, are not a rarity in insects (Hebert et al., 2004; Smith et al., 2008; Seifert, 2009), in other arthropods (Wilcox et al., 1997; Arthofer et al., 2013), and even in vertebrates (Oliver et al., 2009). One major problem for the in-depth investigation of cryptic species is the high effort needed for correct species identification.

