Abstract

As complex microcosms, lakes have attracted exponentially increasing attention, resulting in more than 10^5 interdisciplinary academic publications. Exploring the massive unstructured text of these publications to understand lake topics from global- and centurial-scale perspectives is therefore challenging, and conventional bibliometrics are limited by their inability to understand the literature. This study proposes a novel approach for large-scale text clustering, Natural Language Understanding-Based Deep Clustering (NLU-DC), and applies it to a global meta-analysis of the evolution patterns of lake topics. The validated NLU-DC raised the proportion of available keywords from 24% to 70%, correcting the statistical bias in traditional evidence synthesis. Its high performance derives from the integration of a deep learning model, cosine distance, DBSCAN clustering and changing hyperparameters. The approach is accurate and efficient for large text datasets. Using it, this study identified centurial-scale topics related to lakes from a large literature dataset covering >130,000 studies. The results showed that topics became increasingly abundant yet remained stably concentrated around central ones. Six evolution patterns were identified with a generalized linear model (GLM): fluctuating, emerging in 1970, emerging after 1970, trending-up, stable and trending-down. We found that, in the last twenty years, few emerging topics have attracted significant academic attention, and that the dependency between topics is receiving more attention than before. To sustain lake studies, it is essential to strengthen integrated studies on multi-pattern topics, in particular emerging and trending-down topics.
Our study verified that NLU-DC, which combines state-of-the-art deep learning models from natural language processing with an efficient clustering algorithm from machine learning, is a powerful method for global meta-analysis of water-related research fields and has great potential for application in all fields of academic study.
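
The abstract names the ingredients of the clustering step (embeddings from a deep learning model, cosine distance, DBSCAN, and varied hyperparameters) but not their implementation. The following is a minimal sketch of how these pieces can fit together, not the authors' code. The 3-D vectors are made-up stand-ins for model embeddings, the keywords are illustrative, and the `eps` and `min_samples` values are assumptions chosen for the toy data.

```python
import math

def cosine_distance(u, v):
    """Return 1 - cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def dbscan(vectors, eps, min_samples):
    """Label each vector with a cluster id (0, 1, ...) or -1 for noise."""
    n = len(vectors)
    # A point's neighbourhood includes the point itself.
    neighbours = [
        [j for j in range(n) if cosine_distance(vectors[i], vectors[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n  # None means "not yet visited"
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbours[i]) < min_samples:
            labels[i] = -1  # provisional noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbours[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point joins as border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbours[j]) >= min_samples:
                queue.extend(neighbours[j])  # only core points expand the cluster
    return labels

# Hypothetical keywords with hand-made embeddings (assumption: a real pipeline
# would obtain these vectors from a language model).
keywords = ["eutrophication", "algal bloom", "nutrient loading",
            "sediment core", "paleolimnology", "ice cover"]
vectors = [[1.0, 0.1, 0.0], [0.9, 0.2, 0.0], [1.0, 0.0, 0.1],
           [0.0, 1.0, 0.1], [0.1, 0.9, 0.0], [0.0, 0.0, 1.0]]

# Vary eps to see how the grouping responds to the hyperparameter.
for eps in (0.005, 0.05):
    labels = dbscan(vectors, eps=eps, min_samples=2)
    n_clusters = len({label for label in labels if label != -1})
    print(f"eps={eps}: {n_clusters} clusters, labels={labels}")
```

On the toy data, a very small `eps` leaves every keyword as noise, while a looser value groups the nutrient-related and sediment-related keywords separately and leaves the unrelated one unclustered. This sensitivity is one reason the distance threshold is usually tuned rather than fixed.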
