How is People's Awareness of "Biodiversity" Measured? Using Sentiment Analysis and LDA Topic Modeling in the Twitter Discourse Space from 2010 to 2020.

Shimon Ohtani

doi:10.1007/s42979-022-01276-w

Abstract

The importance of biodiversity conservation is gradually being recognized worldwide, and 2020 was the final year of the Aichi Biodiversity Targets formulated at the 10th Conference of the Parties to the Convention on Biological Diversity (COP10) in 2010. Unfortunately, the majority of the targets were assessed as unachievable. Measuring public awareness of biodiversity is essential when setting the post-2020 targets, but proposing a method to do so is difficult. This study provides a diachronic exploration of the discourse on “biodiversity” on Twitter from 2010 to 2020, combining sentiment analysis and topic modeling, two techniques commonly used in data science. The unstructured tweets were analyzed and classified through four steps: aggregation and comparison of n-grams; visualization of eight types of emotional tendencies using the NRC emotion lexicon, with a supplemental comparison against a machine learning model; construction of topic models using Latent Dirichlet allocation (LDA); and qualitative analysis of tweet texts based on these models. The results revealed the evolution of words used with “biodiversity” on Twitter over the past decade, the emotional tendencies behind the contexts in which “biodiversity” has been used, and the approximate content of the tweets that make up topics with distinctive characteristics. Although gauging people’s awareness through SNS analysis still has many limitations, it can clearly yield valuable insights. To further refine the research method, it will be crucial to improve analysts’ skills, accumulate research examples, and advance data science.

Supplementary Information: The online version contains supplementary material available at 10.1007/s42979-022-01276-w.
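As a rough illustration of the first two steps named in the abstract, n-gram aggregation and lexicon-based emotion counting, the following minimal Python sketch may help. It is not the study's actual pipeline: the tokenizer, the function names, the sample tweets, and the tiny `TOY_LEXICON` are all hypothetical stand-ins (the real NRC emotion lexicon covers thousands of words and eight emotions), and the LDA step, which would normally rely on a dedicated library, is not shown.

```python
from collections import Counter
import re

# Hypothetical toy lexicon mapping words to emotions.
# These entries are illustrative only, NOT taken from the NRC emotion lexicon.
TOY_LEXICON = {
    "loss": {"sadness", "fear"},
    "threat": {"fear", "anger"},
    "protect": {"trust", "anticipation"},
    "celebrate": {"joy", "anticipation"},
}


def tokenize(text):
    """Lowercase the text and keep runs of ASCII letters as tokens."""
    return re.findall(r"[a-z]+", text.lower())


def ngram_counts(texts, n=2):
    """Count n-grams (as tuples of tokens) across all texts."""
    counts = Counter()
    for text in texts:
        tokens = tokenize(text)
        # zip over n shifted copies of the token list yields the n-grams
        counts.update(zip(*(tokens[i:] for i in range(n))))
    return counts


def emotion_counts(texts, lexicon):
    """Tally one count per emotion each time a lexicon word occurs."""
    counts = Counter()
    for text in texts:
        for token in tokenize(text):
            counts.update(lexicon.get(token, ()))
    return counts


# Made-up example tweets, for illustration only
tweets = [
    "Biodiversity loss is a threat to us all",
    "Let us protect biodiversity and celebrate nature",
    "Biodiversity loss continues",
]

bigrams = ngram_counts(tweets, n=2)
emotions = emotion_counts(tweets, TOY_LEXICON)

print(bigrams.most_common(3))
print(emotions.most_common())
```

Grouping the tweets by year before calling these functions would give the kind of year-by-year comparison the abstract describes, under the same assumptions.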

Full Text