Phylogeny Based Biodiversity Data Queries

Scott Chamberlain

doi:10.3897/biss.2.25589

Abstract

There is a large amount of publicly available biodiversity data from many different data sources. When doing research, one ideally interacts with biodiversity data programmatically so that the work is reproducible. The entry point to biodiversity data records is largely through taxonomic names, or common names in some cases (e.g., birds). However, many researchers have a phylogeny-focused project, meaning taxonomic names are not the ideal interface to biodiversity data. Ideally, it would be simple to go programmatically from a phylogeny to biodiversity records through a phylogeny-based query. I'll discuss a new project, `phylodiv` (https://github.com/ropensci/phylodiv/), that attempts to facilitate phylogeny-based biodiversity data collection (see Fig. 1). The project takes the form of an R software package. The idea is for the user interface to take essentially two inputs: a phylogeny and a phylogeny-based question. Behind the scenes we'll do many things, including gathering taxonomic names and hierarchies for the taxa in the phylogeny, sending queries to GBIF (or other data sources), and mapping the results. The user will of course have control over the behind-the-scenes parts, but I imagine the majority use case will be to input a phylogeny and a question and expect an answer back. We already have R tools to do nearly all parts of the work-flow in Fig. 1: there are many phylogeny tools, `taxize`/`taxizedb` can handle taxonomic name collection, `rgbif` can handle interaction with GBIF, and there are many mapping options in R. A few areas still need work, however. First, there's not yet a clear way to do a phylogeny-based query. Ideally a user will be able to express a simple query like "taxon A vs. its sister group". That's simple to imagine, but implementing it in software is another thing.
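To make the "taxon A vs. its sister group" idea concrete, here is a minimal, language-agnostic sketch in Python. It is not `phylodiv` code (the package itself is in R), and everything in it is an assumption for illustration: the nested-tuple tree representation, the made-up tip labels, and the hypothetical helpers `tips`, `sister_group`, and `sister_query`.

```python
# Toy phylogeny as nested tuples; strings are tips. All taxon labels are made up.
TREE = ((("A", "B"), "C"), ("D", "E"))


def tips(node):
    """Return the tip labels under a node, left to right."""
    if isinstance(node, str):
        return [node]
    return [label for child in node for label in tips(child)]


def sister_group(tree, focal):
    """Return tips of the sister group of `focal` (a tip label or a subtree).

    Returns None when `focal` is not found below the root.
    """
    if isinstance(tree, str):
        return None
    for i, child in enumerate(tree):
        if child == focal:
            # Every other child of the same parent is sister to the focal clade.
            siblings = tree[:i] + tree[i + 1:]
            return [label for sib in siblings for label in tips(sib)]
    for child in tree:
        found = sister_group(child, focal)
        if found is not None:
            return found
    return None


def sister_query(tree, focal):
    """Split a "focal vs. sister group" question into two lists of tip labels."""
    sister = sister_group(tree, focal)
    if sister is None:
        raise ValueError(f"{focal!r} not found in tree (or it is the root)")
    return {"focal": tips(focal), "sister": sister}


print(sister_query(TREE, "C"))         # {'focal': ['C'], 'sister': ['A', 'B']}
print(sister_query(TREE, ("D", "E")))  # {'focal': ['D', 'E'], 'sister': ['A', 'B', 'C']}
```

The two lists of tip labels are what would then be resolved to taxonomic names and sent to an occurrence data source. A real implementation would work on a proper tree object and would need to handle polytomies, unmatched names, and non-monophyletic taxa, none of which this sketch attempts.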
Second, users would ideally like answers back (in this case, a map of occurrences) relatively quickly, so they can iterate on their research work-flow. The most likely solution is to use GBIF's map tile service to visualize binned occurrence data, but we'll need to explore this in detail to make sure it works.
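As a rough illustration of the map tile idea, the sketch below only builds a tile URL and makes no network request. It assumes the URL layout and the `taxonKey`, `bin`, and `hexPerTile` parameters of GBIF's public maps API (v2) as I understand them; check the current GBIF documentation before relying on them. The helper name `gbif_tile_url` and the taxon key `212` are placeholders, not anything defined by `phylodiv`.

```python
from urllib.parse import urlencode

# Assumed base path of GBIF's v2 occurrence density tiles.
GBIF_TILE_BASE = "https://api.gbif.org/v2/map/occurrence/density"


def gbif_tile_url(taxon_key, z=0, x=0, y=0, hex_per_tile=30):
    """Build a URL for one PNG tile of hex-binned occurrences for a taxon key."""
    params = {
        "taxonKey": taxon_key,       # GBIF backbone key for the taxon of interest
        "bin": "hex",                # ask the server to bin occurrences into hexagons
        "hexPerTile": hex_per_tile,  # bin resolution within the tile
    }
    return f"{GBIF_TILE_BASE}/{z}/{x}/{y}@1x.png?{urlencode(params)}"


# One URL per clade in a "focal vs. sister" comparison (keys are placeholders).
url = gbif_tile_url(212)
print(url)
```

Because the binning happens on the server, the client fetches a small image per tile instead of downloading every occurrence record, which is what makes quick iteration plausible.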

Corresponding author: Scott A Chamberlain (myrmecocystus@gmail.com)
Received: 06 Apr 2018 | Published: 21 May 2018
Citation: Chamberlain S (2018) Phylogeny Based Biodiversity Data Queries. Biodiversity Information Science and Standards 2: e25589.

Journal: Biodiversity Information Science and Standards
Publication Date: May 21, 2018
License type: CC BY 4.0
