Fossil identification is an essential and fundamental task for conducting palaeontological research. Because the manual identification of fossils requires extensive experience and is time-consuming, automatic identification methods are proposed. However, these studies are limited to a few or dozens of species, which is hardly adequate for the needs of research. This study enabled the automatic identification of hundreds of species based on a newly established fossil dataset. An available "bivalve and brachiopod fossil image dataset" (BBFID, containing >16,000 "image-label" data pairs, taxonomic determination completed) was created. The bivalves and brachiopods contained in BBFID are closely related in morphology, ecology and evolution that have long attracted the interest of researchers. We achieved >80% identification accuracy at 22 genera and ∼64% accuracy at 343 species using EfficientNetV2s architecture. The intermediate output of the model was extracted and downscaled to obtain the morphological feature space of fossils using t-distributed stochastic neighbor embedding (t-SNE). We found a distinctive boundary between the morphological feature points of bivalves and brachiopods in fossil morphological feature distribution maps. This study provides a possible method for studying the morphological evolution of fossil clades using computer vision in the future.
Read full abstract