Abstract

Many attempts have been made to construct new domain-specific knowledge graphs using the existing knowledge base of various domains. However, traditional “dictionary-based” or “supervised” knowledge graph building methods rely on predefined human-annotated resources of entities and their relationships. The cost of creating human-annotated resources is high in terms of both time and effort. This means that relying on human-annotated resources will not allow rapid adaptability in describing new knowledge when domain-specific information is added or updated very frequently, such as with the recent coronavirus disease-19 (COVID-19) pandemic situation. Therefore, in this study, we propose an Open Information Extraction (OpenIE) system based on unsupervised learning without a pre-built dataset. The proposed method obtains knowledge from a vast amount of text documents about COVID-19 rather than a general knowledge base and add this to the existing knowledge graph. First, we constructed a COVID-19 entity dictionary, and then we scraped a large text dataset related to COVID-19. Next, we constructed a COVID-19 perspective language model by fine-tuning the bidirectional encoder representations from transformer (BERT) pre-trained language model. Finally, we defined a new COVID-19-specific knowledge base by extracting connecting words between COVID-19 entities using the BERT self-attention weight from COVID-19 sentences. Experimental results demonstrated that the proposed Co-BERT model outperforms the original BERT in terms of mask prediction accuracy and metric for evaluation of translation with explicit ordering (METEOR) score.

Highlights

  • Due to the continuous spread of COVID-19, the demand for information from various coronavirus-related domains has increased significantly

  • In this study, we propose a deep learning-based COVID-19 relation extraction method for knowledge graph construction that does not originate from a traditional knowledge base

  • We introduce an unsupervised COVID-19-relation extraction method for updating knowledge graphs using the bidirectional encoder representations from transformer (BERT)

Read more

Summary

Introduction

Due to the continuous spread of COVID-19, the demand for information from various coronavirus-related domains has increased significantly. Such information is usually generated through various media, papers, and patents, and the amount of new information is increasing quickly. As information related to the coronavirus is accumulated continuously, several methods have been considered to manage this information effectively. These methods typically extract meaningful information from huge datasets and refine the information to applicable knowledge. Many attempts have been made to construct knowledge graphs (KGs) that represent knowledge as a directional network

Objectives
Methods
Findings
Conclusion
Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call