Abstract

Text similarity has a wide range of applications, including intelligent information retrieval, question answering systems, duplicate-text checking, and machine translation. Meaning-based text similarity computation is increasingly used to compute the similarity of words and phrases. This paper presents an improved method for calculating Chinese word similarity based on semantic distance. The method uses the knowledge structure and knowledge description method of the underlying sememe knowledge base, takes into account the other factors and weights that influence similarity, and makes full use of the depth and density of the Concept-Sememe tree. Simulation results verify the effectiveness of the method.

Highlights

  • With the rapid spread of information technology, people increasingly rely on computers to handle large volumes of information, especially in text information processing

  • Meaning-based text similarity computation is increasingly used to compute the similarity of words and phrases

  • This paper presents an improved method for calculating Chinese word similarity based on semantic distance, which uses the knowledge structure and knowledge description method of the underlying sememe knowledge base, takes into account the other factors and weights that influence similarity, and makes full use of the depth and density of the Concept-Sememe tree

Summary

Introduction

With the rapid spread of information technology, people increasingly rely on computers to handle large volumes of information, especially in text information processing. Quantitative methods based on statistics can measure the semantic similarity between words precisely and effectively. However, this approach depends heavily on the training corpus, requires a large amount of computation, and is difficult to implement. All the basic sememes form a sememe hierarchy (see Figure 1). This hierarchy is a tree structure and is the basis of our semantic similarity calculation. Because all the sememes form a tree-shaped hierarchy based on hyponymy, we use semantic distance to calculate sememe similarity. The semantic expression of a notional-word concept can be divided into four parts: a) The first independent sememe description: we denote the similarity of the two concepts on this part as Sim(S1, S2). When the similarity of the main parts is lower, the effect of the minor parts' similarity on the overall similarity is reduced.
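The two ideas in this summary, distance-based sememe similarity on a tree and the damping of minor parts by main parts, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy hierarchy `PARENT`, the function names, the distance-to-similarity mapping `alpha / (alpha + d)` with `alpha = 1.6`, and the cumulative-product weighting are one common way to realize these ideas. The paper's own depth and density adjustments are not reproduced here.

```python
from math import prod

# Hypothetical toy sememe hierarchy (child -> parent, hyponymy links).
# The real hierarchy is the one shown in Figure 1 of the paper.
PARENT = {
    "entity": None,
    "thing": "entity",
    "animate": "thing",
    "human": "animate",
    "animal": "animate",
    "artifact": "thing",
}


def path_to_root(sememe):
    """Return the list of nodes from `sememe` up to the root, inclusive."""
    path = []
    while sememe is not None:
        path.append(sememe)
        sememe = PARENT[sememe]
    return path


def semantic_distance(s1, s2):
    """Number of edges between s1 and s2 through their lowest common ancestor."""
    steps_from_s1 = {node: i for i, node in enumerate(path_to_root(s1))}
    for j, node in enumerate(path_to_root(s2)):
        if node in steps_from_s1:
            return steps_from_s1[node] + j
    raise ValueError("sememes are not in the same tree")


def sememe_similarity(s1, s2, alpha=1.6):
    """Map tree distance to (0, 1]; alpha is an assumed tunable parameter."""
    return alpha / (alpha + semantic_distance(s1, s2))


def concept_similarity(part_sims, weights):
    """Combine per-part similarities, ordered from main to minor.

    Each part's term is multiplied by the similarities of all earlier parts,
    so a low main-part similarity shrinks the contribution of minor parts.
    """
    return sum(w * prod(part_sims[: i + 1]) for i, w in enumerate(weights))
```

For example, `sememe_similarity("human", "animal")` uses a distance of 2 (both are children of `"animate"`), and `concept_similarity([0.2, 1.0], [0.6, 0.4])` gives a lower result than a plain weighted sum would, because the second part's term is scaled by the first part's low similarity.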

The Similarity Calculation of the Words Fusing Multiple Information
Conclusions