Abstract

Scientific names serve to label biodiversity information: information related to species. Names, and their underlying taxonomic definitions, however, are unstable and ambiguous. This reduces the utility of names as identifiers and as indexing tools in biological informatics, where names are commonly used for searching, retrieving and integrating information about species. Semiotics provides a general model for describing the relationship between taxon names and taxon concepts. It distinguishes syntactics, which governs relationships among names, from semantics, which represents the relations between those labels and the taxa to which they refer. In the semiotic context, changes in semantics (i.e., taxonomic circumscription) do not consistently result in a corresponding change in syntax. Further, when syntactic changes do occur, they may be in response to semantic changes or in response to syntactic rules. This lack of consistency in the cardinal relationship between names and taxa limits how scientific names can be used in biological informatics, both for initially anchoring relevant biodiversity information and for subsequently retrieving and integrating it. Precision and recall are two measures of relevance. In biological taxonomy, recall is negatively impacted by changes or ambiguity in syntax, while precision is negatively impacted by changes or ambiguity in semantics. Because changes in syntax are not correlated with changes in semantics, scientific names may be used, singly or conflated into synonymous sets, to improve recall in pattern recognition or search and retrieval. Names cannot be used, however, to improve precision, because changes in syntax do not uniquely identify changes in circumscription. These observations place limits on the utility of scientific names within biological informatics applications that rely on names as identifiers for taxa.
Taxonomic systems and services used to organize and integrate information about taxa must accommodate the inherent semantic ambiguity of scientific names. The capture and articulation of circumscription differences (i.e., multiple taxon concepts) within such systems must be accompanied with distinct concept identifiers that can be employed in association with, or in replacement of, traditional scientific names.
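
The recall and precision argument above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the placeholder names "Aus bus" and "Xus bus", the concept identifiers "C1" and "C2", the four records, and the helper functions are hypothetical and do not come from the paper or from any real dataset. The sketch compares retrieval by a single name, by a synonym set, and by a concept identifier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    name: str     # scientific name string attached to the record
    concept: str  # hypothetical taxon-concept (circumscription) identifier


# Hypothetical records: two synonymous names, each used in two circumscriptions.
RECORDS = [
    Record("Aus bus", "C1"),
    Record("Xus bus", "C1"),  # synonym applied to the same concept
    Record("Aus bus", "C2"),  # same name, different circumscription
    Record("Xus bus", "C2"),
]


def precision_recall(retrieved, relevant):
    """Return (precision, recall) for two collections of record indices."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


def search_by_names(names):
    """Indices of records whose name string is in the given set of names."""
    return [i for i, r in enumerate(RECORDS) if r.name in names]


def search_by_concept(concept):
    """Indices of records tagged with the given concept identifier."""
    return [i for i, r in enumerate(RECORDS) if r.concept == concept]


# The user wants every record about concept C1 and nothing else.
relevant = search_by_concept("C1")

by_name = precision_recall(search_by_names({"Aus bus"}), relevant)
by_synonyms = precision_recall(search_by_names({"Aus bus", "Xus bus"}), relevant)
by_concept = precision_recall(search_by_concept("C1"), relevant)

print("single name :", by_name)
print("synonym set :", by_synonyms)
print("concept id  :", by_concept)
```

In this toy setup, conflating the synonyms raises recall from 0.5 to 1.0 but leaves precision at 0.5, because neither name string distinguishes the two circumscriptions. Only the concept identifier retrieves exactly the relevant records.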

Highlights

  • Scientific names are labels for taxa that are governed by formalized rules of nomenclature

  • Scientific names link most information related to a species but the relationship between nomenclatural syntax and taxonomic semantics is inherently ambiguous

  • Informatics processes that rely on data-gathering methods linked to taxon names are susceptible to this ambiguity and run the risk of providing imprecise or incomplete sets of data to subsequent downstream processes



Introduction

Scientific names are labels for taxa that are governed by formalized rules of nomenclature. Sherborn’s Index Animalium (IA) represents a monumental attempt to capture key data elements regarding the source and orthography of (nearly) all zoological names for species from the beginning of formalized Linnaean zoological nomenclature in 1758 through 1850. Much of the value and respect that IA has received is derived from the enormous amount of work required to compile and verify the names and associated publications. Biologists rely on this reference when they need to consult the original work (Evenhuis 2016). Scientific names form the basis for referring to species, and they label biodiversity information across the entire spectrum of biodiversity knowledge (Thompson and Pape 2016). Without a name associated with an information or data object, the taxonomic link is effectively lost.
