Abstract

With the rapid development of knowledge graph technologies, domain knowledge graphs have become a research hotspot in academia and industry. However, domain knowledge graphs for technical documents are not yet mature, and the semantic information implicit in unstructured technical documents has not been fully exploited. Drawing on the characteristics of technical documents, this paper proposes a TextCNN-based topic information extraction model and constructs a domain knowledge graph for technical documents, using the graph database Neo4j for knowledge storage and visualization. The TextCNN-based model automatically extracts a document's topic information together with summary information such as title, ID, status, meeting, and organization. Experiments show that the model achieves high accuracy on the technical document dataset, which can effectively reduce the cost of manual annotation and data collation. In addition, visualizing the knowledge graph helps researchers search, track, and update technical documents, and shows the evolution of a technology more clearly.

Highlights

  • Knowledge is the cornerstone of cognitive intelligence, making it possible to explain artificial intelligence

  • This paper combines the characteristics of technical proposal documents to process unstructured proposal document data, and builds an information extraction model based on TextCNN

  • The model automatically extracts summary information from technical proposal documents, such as subject, key technology, title, proposal status, document source, and agenda item, which can effectively reduce the cost of manual annotation and data collation

Summary

INTRODUCTION

Knowledge is the cornerstone of cognitive intelligence, making it possible to explain artificial intelligence. A domain knowledge graph built from technical proposal documents is urgently needed. This paper draws on the characteristics of technical proposal documents to process unstructured proposal data and builds an information extraction model based on TextCNN. A domain knowledge graph for technical documents is designed, and Neo4j is selected as the graph storage and visualization tool, which makes it easier for researchers to optimize queries and update proposals. The resulting academic knowledge graph can provide accurate information retrieval and query recommendation for scholars and researchers, and shows the associations among technical proposals and the evolution of key technologies more clearly and intuitively.
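
To make the storage step more concrete, the following is a minimal sketch of how the fields extracted from one proposal (title, ID, status, meeting, organization, topic) might be turned into graph triples and parameterized Cypher `MERGE` statements for Neo4j. The node labels, relationship names, the `ProposalRecord` class, and the sample values are all hypothetical and chosen for illustration; the paper's actual graph schema is not given in this summary.

```python
from dataclasses import dataclass


@dataclass
class ProposalRecord:
    """Fields the extraction model is described as producing (names assumed)."""
    doc_id: str
    title: str
    status: str
    meeting: str
    organization: str
    topic: str


# Hypothetical schema: relationship type -> label of the target node.
TARGET_LABEL = {
    "HAS_TOPIC": "Topic",
    "HAS_STATUS": "Status",
    "PRESENTED_AT": "Meeting",
    "SUBMITTED_BY": "Organization",
}


def to_triples(rec: ProposalRecord) -> list[tuple[str, str, str]]:
    """Express one record as (document id, relationship, value) triples."""
    return [
        (rec.doc_id, "HAS_TOPIC", rec.topic),
        (rec.doc_id, "HAS_STATUS", rec.status),
        (rec.doc_id, "PRESENTED_AT", rec.meeting),
        (rec.doc_id, "SUBMITTED_BY", rec.organization),
    ]


def to_cypher(rec: ProposalRecord) -> list[tuple[str, dict]]:
    """Build (query, parameters) pairs; MERGE keeps repeated loads idempotent."""
    statements = [(
        "MERGE (d:Document {id: $id}) SET d.title = $title",
        {"id": rec.doc_id, "title": rec.title},
    )]
    for doc_id, rel, value in to_triples(rec):
        # Labels and relationship types cannot be Cypher parameters, so they
        # are interpolated from the fixed TARGET_LABEL table, never from data.
        label = TARGET_LABEL[rel]
        query = (
            f"MATCH (d:Document {{id: $id}}) "
            f"MERGE (n:{label} {{name: $name}}) "
            f"MERGE (d)-[:{rel}]->(n)"
        )
        statements.append((query, {"id": doc_id, "name": value}))
    return statements


# Hypothetical sample record, for illustration only.
record = ProposalRecord(
    doc_id="DOC-0001",
    title="Example proposal title",
    status="approved",
    meeting="Example Meeting 1",
    organization="Example Org",
    topic="example topic",
)
statements = to_cypher(record)
```

Each `(query, parameters)` pair could then be passed to a Neo4j session's `run` method. Because topics, meetings, and organizations are merged by name, proposals that share one of them end up linked through the same node, which is what would make associations between proposals visible in the graph view.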

RELATED WORK
CONSTRUCTION OF DOMAIN KNOWLEDGE GRAPH FOR TECHNICAL DOCUMENTS
CONCLUSION