Abstract

This paper presents a new dependency parsing tree (DPT) generation algorithm. Unlike similar algorithms, which are based on statistical probability models, the algorithm converts the DPT generation problem into the problem of dividing a sentence into semantic segments. We first analyze the co-occurrence frequency of words and show that it can serve as a basis for judging the semantic dependency relationship between words. We then analyze how the entropy of word co-occurrence frequency changes within a semantic unit (the sentence is used as the basic semantic unit in this paper), and we present an algorithm that divides a sentence into semantic fragments whose words have tight semantic relationships with each other. Based on this work, the DPT generation algorithm is divided into three steps. The first step divides the sentence into semantic fragments. The second step distinguishes semantic core words from non-core words according to the semantic dependency relationships between the words in a semantic fragment. The last step generates the DPT according to the semantic dependency relationships between the semantic core words. Experiments on court documents collected from the web show that the DPT generated by our algorithm is highly consistent with the DPT produced by human annotators.
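
The three steps can be illustrated with a minimal Python sketch. The abstract does not give the actual decision rules, so everything concrete here is an assumption made for illustration: the function names, the toy corpus, the fixed `threshold` that stands in for the entropy-based fragment boundary, the "largest total co-occurrence" rule for picking a core word, and the choice of the first core word as the root. It is not the authors' implementation.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(corpus):
    """Count, for each unordered word pair, the number of sentences containing both."""
    counts = Counter()
    for sentence in corpus:
        for a, b in combinations(sorted(set(sentence)), 2):
            counts[(a, b)] += 1
    return counts


def cooc(counts, a, b):
    """Look up the co-occurrence count of an unordered pair (0 if never seen)."""
    return counts[tuple(sorted((a, b)))]


def split_fragments(sentence, counts, threshold):
    """Step 1 (assumed rule): cut between adjacent words whose count is below threshold.

    Assumes a non-empty sentence. The paper uses an entropy-based criterion instead.
    """
    fragments = [[sentence[0]]]
    for prev, word in zip(sentence, sentence[1:]):
        if cooc(counts, prev, word) >= threshold:
            fragments[-1].append(word)
        else:
            fragments.append([word])
    return fragments


def core_word(fragment, counts):
    """Step 2 (assumed rule): the word with the largest total co-occurrence
    with the other words of its fragment; ties go to the earliest word."""
    return max(
        fragment,
        key=lambda w: sum(cooc(counts, w, o) for o in fragment if o != w),
    )


def build_tree(sentence, counts, threshold):
    """Step 3 (assumed rule): return the tree as (head, dependent) edges."""
    fragments = split_fragments(sentence, counts, threshold)
    cores = [core_word(f, counts) for f in fragments]
    edges = []
    # Non-core words depend on the core word of their own fragment.
    for fragment, core in zip(fragments, cores):
        edges += [(core, w) for w in fragment if w != core]
    # Placeholder rule: the first core word is the root, other cores attach to it.
    edges += [(cores[0], c) for c in cores[1:]]
    return fragments, cores, edges


# Hypothetical toy corpus of tokenized sentences.
corpus = [
    ["court", "accepts", "the", "case"],
    ["the", "case", "is", "closed"],
    ["court", "accepts", "appeal"],
]
counts = cooccurrence_counts(corpus)
fragments, cores, edges = build_tree(corpus[0], counts, threshold=2)
print(fragments, cores, edges)
```

The sketch returns one edge fewer than the number of words, as a tree requires, provided no word is repeated within the sentence. Because the tie-breaking and root-selection rules are placeholders, the core words it picks on such a small corpus are not linguistically meaningful.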
