Abstract
Handling data sparseness is a key way to further improve the performance of head-driven statistical parsing models. Two smoothing methods are proposed to mitigate the remaining data-sparseness problems. The first is a pair of word classification algorithms based on word similarity, which use the mutual information of two words that are adjacent or semantically related to define word similarity and word-class similarity. The second decomposes the generation of each internal rule into a sequence of smaller steps and then makes conditional independence assumptions, so that the part-of-speech tags of adjoining words, or adjoining phrase tags, can be incorporated into the probability computation of the context-free rules. Incorporating this additional context information into the syntactic parsing models is very useful for improving parsing performance. The two category-based statistical analysis models are evaluated experimentally. The improved parsing model 2 performs far better than the head-driven parsing model: recall reaches 87.89%, accuracy reaches 88.62%, and the F-measure improves by 8.10% compared with the head-driven analysis method.
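The abstract does not give the exact similarity formula, so the following is only a minimal sketch of one common way to score adjacent word pairs with mutual information, namely pointwise mutual information (PMI) estimated from raw counts. The function name `adjacent_pmi`, the toy corpus, and the choice of unsmoothed maximum-likelihood estimates are all assumptions made for illustration, not the authors' method.

```python
import math
from collections import Counter

def adjacent_pmi(sentences):
    """Return PMI (in bits) for every adjacent word pair seen in the corpus.

    Hypothetical helper: probabilities are plain relative frequencies,
    with no smoothing and no semantic-relation pairs.
    """
    unigrams = Counter()
    bigrams = Counter()
    for tokens in sentences:
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))  # adjacent pairs only

    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())

    pmi = {}
    for (w1, w2), count in bigrams.items():
        p_pair = count / n_bi
        p_w1 = unigrams[w1] / n_uni
        p_w2 = unigrams[w2] / n_uni
        # PMI(w1, w2) = log2( P(w1, w2) / (P(w1) * P(w2)) )
        pmi[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))
    return pmi

# Toy corpus (made up for illustration).
corpus = [
    ["the", "dog", "barks"],
    ["the", "cat", "sleeps"],
    ["the", "dog", "sleeps"],
]
scores = adjacent_pmi(corpus)
```

Pairs with high scores co-occur more often than their individual frequencies would predict. A word-class similarity could then be built on top of such scores, for example by comparing the PMI profiles of two words, though that step is not shown here.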