Abstract

Constructing a knowledge graph of geological hazards literature can facilitate the reuse of that literature and provide a reference for geological hazard governance. Named entity recognition (NER), a core technology for constructing such a knowledge graph, faces the challenge that named entities in geological hazard literature are diverse in form, ambiguous in semantics, and uncertain in context. These properties make it difficult to design practical features for NER classification. To address this problem, this paper proposes a deep learning-based NER model, the deep, multi-branch BiGRU-CRF model, which combines a multi-branch bidirectional gated recurrent unit (BiGRU) layer with a conditional random field (CRF) model. In an end-to-end, supervised process, the proposed model automatically learns and transforms features with the multi-branch BiGRU layer and enhances the output with a CRF layer. We also propose a pattern-based corpus construction method to build the corpus needed to train the model. Experimental results indicate that the proposed model outperforms state-of-the-art models. Using the proposed model, we constructed a large-scale geological hazard literature knowledge graph containing 34,457 entity nodes and 84,561 relations.

Highlights

  • Knowledge graphs of geological hazards literature can facilitate the reuse of geological hazards literature and provide a reference for geological hazard mitigation

  • Because named entities in geological hazard literature are semantically ambiguous, we propose a multi-branch structure to extract different levels of semantic information, and use an attention mechanism [1] and a residual structure [2] to enhance the features from branches of different depths

  • We propose a deep learning-based named entity recognition (NER) model for geological hazards; namely, the deep, multi-branch bidirectional gated recurrent unit (BiGRU)-conditional random field (CRF) model, to extract geological hazard named entities and construct a knowledge graph


Summary

Introduction

Knowledge graphs of geological hazards literature can facilitate the reuse of that literature and provide a reference for geological hazard mitigation. A large body of literature on geological hazard research is available on the Wanfang academic platform (Wanfang database), and it is difficult for researchers to read all of these articles to find the information they need. Using machine learning methods to recognize named entities in geological hazard literature and to construct a knowledge graph can greatly enhance the reuse of this literature and make the research and governance of geological hazards more efficient and convenient. Named entity recognition (NER) is a technology that classifies mentions of entities in unstructured text into pre-defined categories. Named entities in geological hazard literature are diverse in form, ambiguous in semantics, and uncertain in context.
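To make the NER task concrete, the following is a minimal sketch of how per-token labels are turned into typed entity mentions. It assumes the common BIO tagging scheme (B- begins an entity, I- continues it, O is outside any entity). The example sentence, the category names (CAUSE, HAZARD, LOCATION), and the helper `bio_to_spans` are hypothetical illustrations and are not taken from the paper, which may use a different tag set.

```python
from typing import List, Optional, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Collect (entity_text, category) pairs from BIO-tagged tokens.

    Hypothetical helper for illustration; not the paper's implementation.
    """
    spans: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_type: Optional[str] = None

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always starts a new entity; close any open one first.
            if current_tokens:
                spans.append((" ".join(current_tokens), current_type))
            current_tokens = [token]
            current_type = tag[2:]
        elif tag.startswith("I-") and current_type == tag[2:]:
            # An I- tag of the same category extends the open entity.
            current_tokens.append(token)
        else:
            # "O", or an I- tag that does not match the open entity,
            # closes the open entity; the token itself is not kept.
            if current_tokens:
                spans.append((" ".join(current_tokens), current_type))
            current_tokens = []
            current_type = None

    # Flush an entity that runs to the end of the sentence.
    if current_tokens:
        spans.append((" ".join(current_tokens), current_type))
    return spans


# Hypothetical sentence and labels, as a tagger might emit them.
tokens = ["Heavy", "rainfall", "triggered", "a", "landslide", "in", "Sichuan"]
tags = ["B-CAUSE", "I-CAUSE", "O", "O", "B-HAZARD", "O", "B-LOCATION"]
entities = bio_to_spans(tokens, tags)
```

In a knowledge graph pipeline, each extracted pair would become a candidate entity node, with relations between nodes identified in a separate step.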

