Automated retrieval of information on threatened species from online sources using machine learning

Ritwik Kulkarni, Enrico Di Minin

doi:10.1111/2041-210x.13608

Open Access

Abstract

As resources for conservation are limited, gathering and analysing information from digital platforms can help investigate the global biodiversity crisis in a cost‐efficient manner. Development and application of methods for automated content analysis of digital data sources are especially important in the context of investigating human–nature interactions. In this study, we introduce novel methods to automatically collect and analyse textual data on species of conservation concern from digital platforms. We construct an end‐to‐end pipeline that begins by searching for and downloading news articles about species listed in Appendix I of the Convention on International Trade in Endangered Species of Wild Fauna and Flora (CITES), along with news articles from specific Twitter handles, and then applies natural language processing and machine learning methods to filter and retain only relevant articles. A crucial aspect here is the automatic annotation of training data, which can be challenging in many machine learning applications. A Named Entity Recognition model is then used to extract additional relevant information from each article. The data collected over a 1‐month period included 15,088 articles focusing on 585 species listed in Appendix I of CITES. The accuracy of the neural network in detecting relevant articles was 95.91%, while the Named Entity Recognition model helped extract information on prices, locations and quantities of traded animals and plants. The system generates a regularly updated database that can be queried and analysed for various research purposes and to inform conservation decision making. The results demonstrate that natural language processing can be used successfully to extract information from digital text content. The proposed methods can be applied to multiple digital data platforms at the same time and used to investigate human–nature interactions in conservation science and practice.
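As a rough illustration of the filter-then-extract flow described in the abstract, the Python sketch below keeps only articles that look relevant and then pulls out prices and quantities. It is not the authors' implementation: a simple keyword rule stands in for the neural network classifier, and regular expressions stand in for the Named Entity Recognition model. The species list, trade terms, patterns, and all function and class names are hypothetical and chosen only for illustration.

```python
import re
from dataclasses import dataclass, field

# Hypothetical vocabularies; the study's actual search terms and
# training labels are not given in this summary.
SPECIES = {"pangolin", "tiger", "african elephant"}
TRADE_TERMS = {"seized", "smuggled", "trafficking", "poached", "illegal trade"}

# Regex stand-ins for the Named Entity Recognition step (assumption).
PRICE_RE = re.compile(r"(?:US)?\$\s?\d[\d,]*(?:\.\d+)?")
QUANTITY_RE = re.compile(
    r"\b(\d[\d,]*(?:\.\d+)?)\s?(kg|kilograms|tonnes|tons)\b", re.IGNORECASE
)


@dataclass
class Article:
    title: str
    text: str
    relevant: bool = False
    species: list = field(default_factory=list)
    prices: list = field(default_factory=list)
    quantities: list = field(default_factory=list)


def weak_label(article):
    """Mark an article relevant if it names a species AND a trade term.

    A rule like this is one way to annotate data automatically; here it
    also replaces the trained classifier to keep the sketch short.
    """
    body = f"{article.title} {article.text}".lower()
    article.species = sorted(s for s in SPECIES if s in body)
    article.relevant = bool(article.species) and any(t in body for t in TRADE_TERMS)
    return article


def extract_entities(article):
    """Collect price and quantity mentions from the article text."""
    article.prices = PRICE_RE.findall(article.text)
    article.quantities = [
        f"{number} {unit.lower()}"
        for number, unit in QUANTITY_RE.findall(article.text)
    ]
    return article


def run_pipeline(articles):
    """Filter to relevant articles, then extract entities from each."""
    kept = [a for a in map(weak_label, articles) if a.relevant]
    return [extract_entities(a) for a in kept]
```

In this sketch, an article about the golfer Tiger Woods mentions a species name but no trade term, so it is dropped. That mirrors the reason a relevance filter is needed when searching news by species name alone.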

Highlights

Global biodiversity loss is one of the great sustainability challenges our society is facing (Butchart et al., 2010).
This study demonstrates the potential of using natural language processing and machine learning to identify relevant articles focusing on species of conservation concern and to extract relevant information that can be used for further analyses.
While this study focuses on CITES Appendix I listed species and mines data on these species from online news and Twitter, the proposed methods allow for automated content analysis from multiple digital platforms at the same time and for the inclusion of a larger number of species globally.

Summary

Introduction

Global biodiversity loss is one of the great sustainability challenges our society is facing (Butchart et al., 2010). In the Information Age, digital data can be leveraged to help address the global biodiversity crisis and study how humans interact with nature (Di Minin et al., 2015; Ladle et al., 2016). Methods for automated content analysis of this deluge of digital data are needed (Di Minin et al., 2019; Lamba et al., 2019; Toivonen et al., 2019). Conservation culturomics is the field of conservation science in which digital data sources and methods are leveraged to help address the global biodiversity crisis and study human–nature interactions (Correia et al., 2021). Digital data sources have the potential to provide information on human–nature interactions at fine spatial and temporal scales (Di Minin et al., 2015). Attention should be paid to ensuring responsible use of these data in accordance with data privacy requirements (Di Minin et al., 2021).

Journal: Methods in Ecology and Evolution
Publication Date: May 6, 2021
Citations: 19
License type: CC BY 4.0
