Abstract
Text similarity has a wide range of applications, including intelligent information retrieval, question answering systems, text duplicate checking, and machine translation. Meaning-based text similarity computation is increasingly used to compute the similarity of words and phrases. This paper presents an improved method for calculating Chinese word similarity based on semantic distance. The method uses the knowledge structure of the [...] and its method of knowledge description, takes into account the other factors and weights that influence similarity, and makes full use of the depth and density of the Concept-Sememe tree. Simulation results verify the effectiveness of the method.
Highlights
With the rapid development of the information society, there is a growing need for computers to process large amounts of information, especially in text information processing
Meaning-based text similarity computation is increasingly used to compute the similarity of words and phrases
This paper presents an improved method for calculating Chinese word similarity based on semantic distance, which uses the knowledge structure of the [...] and its method of knowledge description, takes into account the other factors and weights that influence similarity, and makes full use of the depth and density of the Concept-Sememe tree
Summary
With the rapid development of the information society, there is a growing need for computers to process large amounts of information, especially in text information processing. Statistics-based quantitative methods can measure the semantic similarity between words precisely and effectively. However, this approach depends heavily on the training corpus used, requires a large amount of computation, and is complex to apply. All the basic sememes form a sememe hierarchy (see Figure 1). This hierarchy is a tree structure and is the basis of our semantic similarity calculation. Because all sememes form a tree-structured hierarchy based on hyponymy, we use semantic distance to calculate sememe similarity. The semantic expression of a notional-word concept can be divided into four parts: a) The first independent sememe description: we denote the similarity of the two concepts on this part as Sim(S1, S2). When the similarity of the main parts is low, the effect of the minor parts' similarity on the overall similarity is reduced.
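As a rough illustration of the two ideas in this summary, the sketch below computes a distance-based sememe similarity on a tree and then combines four part similarities so that a low main-part similarity damps the minor parts. Everything specific in it is an assumption made for illustration: the toy hierarchy in PARENT, the form alpha / (distance + alpha), the value alpha = 1.6, the weights in betas, and the cumulative-product combination are not given in this excerpt. It also does not include the depth and density adjustments that the paper's improved method relies on.

```python
# Hypothetical toy sememe hierarchy (child -> parent); not taken from the paper.
PARENT = {
    "entity": None,
    "thing": "entity",
    "animate": "thing",
    "human": "animate",
    "animal": "animate",
    "inanimate": "thing",
    "artifact": "inanimate",
}


def path_to_root(sememe):
    """Return the list of nodes from `sememe` up to the root, inclusive."""
    path = []
    while sememe is not None:
        path.append(sememe)
        sememe = PARENT[sememe]
    return path


def semantic_distance(a, b):
    """Number of edges on the tree path between sememes `a` and `b`."""
    steps_from_b = {node: i for i, node in enumerate(path_to_root(b))}
    for steps_from_a, node in enumerate(path_to_root(a)):
        if node in steps_from_b:  # first shared node is the lowest common ancestor
            return steps_from_a + steps_from_b[node]
    raise ValueError("sememes do not share a root")


def sememe_similarity(a, b, alpha=1.6):
    """Assumed distance-to-similarity mapping: alpha / (distance + alpha)."""
    return alpha / (semantic_distance(a, b) + alpha)


def concept_similarity(part_sims, betas=(0.5, 0.2, 0.17, 0.13)):
    """Combine four part similarities, main part first.

    Each part's term is scaled by the product of all earlier part
    similarities, so a low main-part similarity reduces the contribution
    of the minor parts. The weights in `betas` are illustrative only.
    """
    total, running = 0.0, 1.0
    for beta, sim in zip(betas, part_sims):
        running *= sim
        total += beta * running
    return total


if __name__ == "__main__":
    print(sememe_similarity("human", "animal"))    # siblings, distance 2
    print(sememe_similarity("human", "artifact"))  # distance 4
    print(concept_similarity((0.2, 1.0, 1.0, 1.0)))
    print(concept_similarity((1.0, 0.2, 1.0, 1.0)))
```

With these assumed weights, lowering the first (main) part similarity pulls the overall score down further than lowering a minor part by the same amount, which is the behaviour the last sentence of the summary describes.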