Biodiversity Observations Miner: A web application to unlock primary biodiversity data from published literature

Gabriel Muñoz, W Daniel Kissling, E Emiel Van Loon

doi:10.3897/bdj.7.e28737

Abstract

Background

A considerable portion of primary biodiversity data is digitally locked inside published literature, which is often stored as PDF files. Large-scale approaches to biodiversity science could benefit from retrieving this information and making it digitally accessible and machine-readable. Nonetheless, the amount and diversity of digitally published literature pose many challenges for knowledge discovery and retrieval. Text mining has been extensively used for data discovery tasks in large quantities of documents. However, text mining approaches for knowledge discovery and retrieval have been limited in biodiversity science compared to other disciplines.

New information

Here, we present a novel, open source text mining tool, the Biodiversity Observations Miner (BOM). This web application, written in R, allows the semi-automated discovery of punctual biodiversity observations (e.g. biotic interactions, functional or behavioural traits and natural history descriptions) associated with the scientific names present inside a corpus of scientific literature. Furthermore, BOM enables users to rapidly screen large quantities of literature based on word co-occurrences that match custom biodiversity dictionaries. This tool aims to increase the digital mobilisation of primary biodiversity data and is freely accessible via GitHub or through a web server.
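
The screening idea described in the abstract, keeping text in which scientific names co-occur with terms from a custom dictionary, can be illustrated with a minimal sketch. This is not BOM's code (BOM is an R web application) and makes several simplifying assumptions: the Python function names `split_sentences` and `screen_text` are hypothetical, scientific names are supplied as a plain list rather than detected automatically, sentences are split naively on end punctuation, and the example text, names and dictionary are invented for illustration.

```python
import re

def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def screen_text(text, names, dictionary):
    """Return sentences in which a scientific name co-occurs with a dictionary term."""
    hits = []
    for sentence in split_sentences(text):
        # Names are matched as literal substrings (an assumption of this sketch).
        found_names = [n for n in names if n in sentence]
        # Lowercased word tokens are intersected with the dictionary terms.
        tokens = set(re.findall(r"[a-z]+", sentence.lower()))
        found_terms = sorted(tokens & dictionary)
        if found_names and found_terms:
            hits.append({"sentence": sentence,
                         "names": found_names,
                         "terms": found_terms})
    return hits

# Hypothetical inputs, invented for illustration only.
names = ["Euterpe precatoria", "Ramphastos tucanus"]
dictionary = {"disperses", "feeds", "pollinates"}
text = ("Ramphastos tucanus feeds on the fruits of Euterpe precatoria. "
        "The study site is located in lowland forest. "
        "Euterpe precatoria is a common palm.")

hits = screen_text(text, names, dictionary)
for hit in hits:
    print(hit["names"], hit["terms"], "|", hit["sentence"])
```

Only the first sentence is retained, because it is the only one containing both a listed name and a dictionary term. A real workflow would still need PDF text extraction, proper taxonomic name recognition and manual review of the retrieved snippets, which is why the abstract calls the approach semi-automated.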

Highlights

Mobilisation, digitalization and interoperability of data on biodiversity are vital for sharing our global knowledge of nature (Wilkinson et al. 2016, Kissling et al. 2015, Edwards 2000).
The need for digitally available biodiversity data has resulted in the development of global cyber-infrastructures such as the Global Biodiversity Information Facility (GBIF: www.gbif.org) (Edwards 2001), the Plant Trait Database (TRY: www.try-db.org) (Kattge et al. 2011), the Data Observation Network for Earth (DataOne: www.dataone.org) (Michener et al. 2011) and Global Biotic Interactions (GloBi: www.globalbioticinteractions.org) (Poelen et al. 2014).
We present the Biodiversity Observations Miner (BOM), a text mining tool designed to help ecologists and biodiversity scientists integrate text mining frameworks into their data compilation workflows.

Summary

Introduction

Mobilisation, digitalization and interoperability of data on biodiversity are vital for sharing our global knowledge of nature (Wilkinson et al. 2016, Kissling et al. 2015, Edwards 2000). The need for digitally available biodiversity data has resulted in the development of global cyber-infrastructures such as the Global Biodiversity Information Facility (GBIF: www.gbif.org) (Edwards 2001), the Plant Trait Database (TRY: www.try-db.org) (Kattge et al. 2011), the Data Observation Network for Earth (DataOne: www.dataone.org) (Michener et al. 2011) and Global Biotic Interactions (GloBi: www.globalbioticinteractions.org) (Poelen et al. 2014). However, a considerable amount of biodiversity data is still locked inside the current corpus of published literature (Nguyen et al. 2017). This pool of biodiversity data is often stored and shared as PDF files, which limits its interoperability.

Journal: Biodiversity Data Journal	Publication Date: Jan 16, 2019
Citations: 5	License type: CC BY 4.0
