Abstract

Word embedding models have recently shown some capability to encode hierarchical information that exists in textual data. However, such models do not explicitly encode the hierarchical structure that exists among words. In this work, we propose a method to learn hierarchical word embeddings (HWEs) in a specific order to encode the hierarchical information of a knowledge base (KB) in a vector space. To learn the word embeddings, our proposed method considers not only the hypernym relations that exist between words in a KB but also contextual information in a text corpus. The experimental results on various applications, such as supervised and unsupervised hypernymy detection, graded lexical entailment prediction, hierarchical path prediction, and word reconstruction tasks, show the ability of the proposed method to encode the hierarchy. Moreover, the proposed method outperforms previously proposed methods for learning nonspecialised, hypernym-specific, and hierarchical word embeddings on multiple benchmarks.

Highlights

  • Organising the meanings of concepts in the form of a hierarchy is a standard practice ubiquitous in many fields, including medicine, biology, and linguistics

  • We evaluate the learnt hierarchical word embeddings (HWEs) on four main tasks: standard supervised and unsupervised hypernym detection tasks, and newly proposed hierarchical path prediction and word reconstruction tasks

Summary

Introduction

Organising the meanings of concepts in the form of a hierarchy is a standard practice ubiquitous in many fields, including medicine (http://www.snomed.org/), biology (https://www.bbc.co.uk/ontologies/wo), and linguistics (https://wordnet.princeton.edu/). Given a training corpus and a KB (which we refer to as a taxonomy in this paper), we learn word embeddings that simultaneously encode the hierarchical path structure in the taxonomy as well as the co-occurrence statistics between pairs of words in the corpus. For example, matching the pattern X such as Y against the sentence “some birds recorded in Africa such as Gadwall” will incorrectly detect (Gadwall, Africa) as having a hypernymic relation. Such noise in corpus-based approaches can be reduced by guiding the learning process using a taxonomy. The learned HWEs can be used to assign novel words to the paths in a given taxonomy (Subsection 4.4). This is useful when the taxonomy is incomplete, because we can expand the taxonomy using the information available in the corpus. This provides an explicit interpretation of the word semantics that is otherwise only implicitly embedded in a lower-dimensional vector space.
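The noise the Introduction describes can be seen with a minimal sketch of Hearst-style pattern extraction. This is a hypothetical illustration, not the paper's extraction pipeline: a naive regular expression for the pattern X such as Y picks the word immediately before "such as" as the hypernym, so an intervening phrase like "recorded in Africa" yields the spurious pair (Gadwall, Africa) instead of (Gadwall, birds).

```python
import re

# Naive lexico-syntactic pattern "X such as Y": X is taken as the
# hypernym candidate, Y as the hyponym candidate.
SUCH_AS = re.compile(r"(\w+)\s+such as\s+(\w+)")

def extract_pairs(sentence):
    """Return candidate (hyponym, hypernym) pairs from one sentence."""
    return [(y, x) for x, y in SUCH_AS.findall(sentence)]

sentence = "some birds recorded in Africa such as Gadwall"
# The intended pair is (Gadwall, birds), but the surface match grabs
# the nearest preceding noun, producing the spurious (Gadwall, Africa).
print(extract_pairs(sentence))  # [('Gadwall', 'Africa')]
```

Guiding the learning process with a taxonomy, as the proposed method does, lets such spurious corpus-derived pairs be down-weighted against the hypernym relations recorded in the KB.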

Related Work
Hierarchical Word Embeddings
Experiments and Results
Conclusion

