Information extraction and knowledge graph construction from geoscience literature

Chengbin Wang, Xiaogang Ma, Jianguo Chen, Jingwen Chen

doi:10.1016/j.cageo.2017.12.007

Open Access

https://doi.org/10.1016/j.cageo.2017.12.007

Abstract

Geoscience literature published online is an important part of open data, and brings both challenges and opportunities for data analysis. Compared with studies of numerical geoscience data, there has been limited work on information extraction and knowledge discovery from textual geoscience data. This paper presents a workflow and a few empirical case studies on that topic, with a focus on documents written in Chinese. First, we set up a hybrid corpus combining generic terms and geology terms from geology dictionaries to train the Chinese word segmentation rules of a Conditional Random Fields model. Second, we used the word segmentation rules to parse documents into individual words, and removed the stop-words from the segmentation results to obtain a corpus of content-words. Third, we used a statistical method to analyze the semantic links between content-words, and we selected chord and bigram graphs to visualize the content-words and their links as the nodes and edges, respectively, of a knowledge graph. The resulting graph presents a clear overview of the key information in an unstructured document. This study demonstrates the usefulness of the designed workflow, and shows the potential of leveraging natural language processing and knowledge graph technologies for geoscience.
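The second and third steps of the workflow can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the text has already been segmented into words (the paper does this with a Conditional Random Fields model, which is not reproduced here), uses a made-up stop-word list and toy English tokens in place of segmented Chinese words, and substitutes plain adjacent-pair counts for the unspecified statistical method. The names `STOP_WORDS`, `content_words`, and `bigram_edges` are hypothetical.

```python
from collections import Counter

# Hypothetical stop-word list; the paper's actual list is not given in the abstract.
STOP_WORDS = {"the", "of", "is", "in", "and"}


def content_words(tokens, stop_words=STOP_WORDS):
    """Drop stop-words from an already segmented token sequence."""
    return [t for t in tokens if t not in stop_words]


def bigram_edges(sentences, stop_words=STOP_WORDS):
    """Count adjacent content-word pairs; each pair is a weighted graph edge."""
    edges = Counter()
    for tokens in sentences:
        words = content_words(tokens, stop_words)
        # zip(words, words[1:]) yields each pair of neighbouring content-words.
        edges.update(zip(words, words[1:]))
    return edges


# Toy, made-up input standing in for the output of a word segmenter.
sentences = [
    ["the", "gold", "deposit", "is", "in", "the", "fault", "zone"],
    ["the", "gold", "deposit", "and", "the", "quartz", "vein"],
]

edges = bigram_edges(sentences)                   # edge -> co-occurrence count
nodes = {word for pair in edges for word in pair}  # content-words as graph nodes
```

Here `nodes` and `edges` correspond to the content-words and their links in the knowledge graph; the counts could serve as edge weights when drawing a bigram or chord graph with a plotting library of choice.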

Journal: Computers & Geosciences
Publication Date: Dec 16, 2017
Citations: 152
License type: publisher-specific-oa
