Abstract

Constructing a knowledge graph is time-consuming, and a knowledge graph in the scientific domain incurs extremely high labor costs because extracting knowledge from the source material requires substantial prior knowledge. The main inputs for building a scientific research knowledge graph are papers, patents, project descriptions, and national programs (such as the National High Technology Research and Development Program of China, the Major State Basic Research Development Program of China, the General Program, the Key Program, and the Major Program). All of these are unstructured data, so human participation is usually necessary to assess quality. In this paper, we design and propose a framework based on active learning that extracts entities and relations from unstructured science and technology research data. The framework combines human input with machine learning, which is the essence of active learning, to help users extract entities from unstructured data at a lower time cost. By using those data to construct a Concept Knowledge Graph (CKG) that serves as the annotation labels, the framework further implements active learning tools and helps experts annotate the data rapidly and with high accuracy. The knowledge graphs constructed by this framework can be used to find similar research areas, similar researchers, popular research areas, and so on.

Highlights

  • In the scientific domain, knowledge graphs can be used in many ways

  • We design and propose a framework based on active learning that extracts entities and relations from unstructured science and technology research data. The framework combines human input with machine learning, which is the essence of active learning, to help users extract entities from unstructured data at a lower time cost. By using those data to construct a Concept Knowledge Graph (CKG) that serves as the annotation labels, the framework further implements active learning tools and helps experts annotate the data rapidly and with high accuracy

  • The knowledge graphs constructed by this framework can be used to find similar research areas, similar researchers, popular research areas, and so on


Summary

Introduction

Knowledge graphs can be used in many ways. For example, they can be used to identify deviant researchers who have not made enough research contributions. A knowledge graph collects a massive number of interrelated facts that connect different concepts and instances, and these facts can be transformed into practical knowledge (Pujara, Miao, Getoor, & Cohen, 2013). In the scientific research domain, the IKG contains instance data such as paper titles, the content of research projects, and so on. The data sources used for construction are generally unstructured, so the knowledge has to be extracted manually at considerable labor cost. To address this labor cost, we apply active learning to reduce the human workload during the annotation of unstructured scientific data, and we combine it with an “expert-in-the-loop” methodology to maintain the quality of the entity annotation and relation extraction results.
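To make the idea of active learning with an expert in the loop concrete, the following is a minimal sketch, not the framework described in the paper. It assumes a pool-based setting with uncertainty sampling (the excerpt does not state which query strategy the authors use), a toy naive Bayes model, and a simulated expert backed by a hypothetical `GOLD` dictionary. All function names, phrases, and labels are illustrative assumptions.

```python
import math
from collections import Counter

def train(labeled):
    """Fit a tiny naive Bayes model: per-class word counts and class counts."""
    word_counts = {0: Counter(), 1: Counter()}
    class_counts = Counter()
    for text, label in labeled:
        class_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, class_counts

def prob_entity(model, text):
    """Return P(label = 1 | text) using Laplace smoothing."""
    word_counts, class_counts = model
    vocab = set(word_counts[0]) | set(word_counts[1])
    n_docs = sum(class_counts.values())
    log_scores = {}
    for label in (0, 1):
        total = sum(word_counts[label].values())
        score = math.log((class_counts[label] + 1) / (n_docs + 2))
        for word in text.lower().split():
            # The extra +1 in the denominator reserves mass for unseen words.
            score += math.log((word_counts[label][word] + 1) / (total + len(vocab) + 1))
        log_scores[label] = score
    top = max(log_scores.values())
    weights = {k: math.exp(v - top) for k, v in log_scores.items()}
    return weights[1] / (weights[0] + weights[1])

def uncertainty(p):
    """Binary entropy: largest when the model is least sure (p near 0.5)."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def active_learning(seed, pool, ask_expert, budget):
    """Repeatedly send the most uncertain unlabeled item to the expert."""
    labeled = list(seed)
    unlabeled = list(pool)
    for _ in range(min(budget, len(unlabeled))):
        model = train(labeled)
        query = max(unlabeled, key=lambda t: uncertainty(prob_entity(model, t)))
        unlabeled.remove(query)
        labeled.append((query, ask_expert(query)))  # the human-in-the-loop step
    return train(labeled), labeled, unlabeled

# Hypothetical candidate phrases; 1 = research entity, 0 = not an entity.
GOLD = {
    "graph neural network": 1,
    "in this paper": 0,
    "neural relation extraction": 1,
    "we propose in this work": 0,
    "knowledge graph embedding": 1,
    "as shown in this section": 0,
}
seed = [("graph neural network", 1), ("in this paper", 0)]
seed_texts = {text for text, _ in seed}
pool = [text for text in GOLD if text not in seed_texts]

# The simulated expert just looks up the gold label; a real expert would annotate.
model, labeled, remaining = active_learning(seed, pool, GOLD.__getitem__, budget=3)
```

In this sketch the expert labels only `budget` items rather than the whole pool, which is where the reduction in annotation workload would come from; the remaining items could be labeled by the model and spot-checked by the expert.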

Related Work
Framework and Workflow
The Framework for Extracting Science and Technology Research Data
The Workflow
Toolsets and Modules
Human-in-the-Loop Active Learning Toolset
Named Entity Recognition and Relation Extraction Process
Quality Control
Findings
Conclusion
