Improvement of TF-IDF Algorithm Based on Knowledge Graph

Yanpeng Wang, Ye Yuan, Qing Liu, Dehai Zhang, Yun Yang

doi:10.1109/sera.2018.8477196

Abstract

The TF-IDF algorithm is commonly used in text information retrieval and data mining. The traditional TF-IDF algorithm considers neither the domain characteristics of an article nor the distribution ratio. The solutions proposed so far by many scholars address only problems such as the distribution ratio, and do not address the unreasonable weights assigned to domain keywords. As a result, in domain-specific applications, relevant keywords in some fields are not given appropriate weights. This paper proposes an improved method based on a domain knowledge graph. The method focuses on applications in the legal field and uses a legal knowledge graph to improve the TF-IDF algorithm, so that domain-related keywords are assigned reasonable weights during text feature extraction. Experiments show that this method can effectively improve the accuracy of the extraction.
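
To make the idea concrete, the following is a minimal sketch of classic TF-IDF followed by a domain-aware reweighting step. It is not the formula proposed in this paper, which the abstract does not state. The set `legal_terms` is a hypothetical stand-in for entities drawn from a legal knowledge graph, and the function `boost_domain_terms` and its `boost` factor are assumptions made purely for illustration.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Classic TF-IDF: tf = count / document length, idf = log(N / df)."""
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter()
    for doc in docs:
        df.update(set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        scores.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


def boost_domain_terms(scores, domain_terms, boost=2.0):
    """Hypothetical reweighting: scale the weight of terms found in a domain vocabulary.

    `domain_terms` stands in for knowledge-graph entities; `boost` is an
    illustrative constant, not a value taken from the paper.
    """
    return [
        {term: weight * boost if term in domain_terms else weight
         for term, weight in doc_scores.items()}
        for doc_scores in scores
    ]


# Toy tokenized corpus (illustrative data only).
docs = [
    ["court", "contract", "breach", "the", "the"],
    ["the", "weather", "report"],
    ["contract", "the", "parties"],
]
legal_terms = {"court", "contract", "breach"}  # assumed domain vocabulary

base = tf_idf(docs)
boosted = boost_domain_terms(base, legal_terms)
```

In this sketch, a word that occurs in every document (such as "the") receives weight zero under classic TF-IDF, while terms in the assumed legal vocabulary have their weights scaled up and all other terms are left unchanged. A real knowledge-graph-based method would presumably derive the adjustment from the graph structure rather than from a flat set and a fixed constant.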

Full Text