Abstract
AbstractPatterns have been extensively used to extract hypernym relations from texts. The most popular patterns are Hearst’s patterns, formulated as regular expressions mainly based on lexical information. Experiences have reported good precision and low recall for such patterns. Thus, several approaches have been developed for improving recall. While these approaches perform better in terms of recall, it remains quite difficult to further increase recall without degrading precision. In this paper, we propose a novel 3-phase approach based on sequential pattern mining to improve pattern-based approaches in terms of both precision and recall by (i) using a rich pattern representation based on grammatical dependencies (ii) discovering new hypernym patterns, and (iii) extending hypernym patterns with anti-hypernym patterns to prune wrong extracted hypernym relations. The results obtained by performing experiments on three corpora confirm that using our approach, we are able to learn sequential patterns and combine them to outperform existing hypernym patterns in terms of precision and recall. The comparison to unsupervised distributional baselines for hypernym detection shows that, as expected, our approach yields much better performance. When compared to supervised distributional baselines for hypernym detection, our approach can be shown to be complementary and much less loosely coupled with training datasets and corpora.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.