Abstract
Pre-trained embeddings such as word embeddings and sentence embeddings are fundamental tools facilitating a wide range of downstream NLP tasks. In this work, we investigate how to learn a general-purpose embedding of textual relations, defined as the shortest dependency path between entities. Textual relation embedding provides a level of knowledge between word/phrase level and sentence level, and we show that it can facilitate downstream tasks requiring relational understanding of the text. To learn such an embedding, we create the largest distant supervision dataset by linking the entire English ClueWeb09 corpus to Freebase. We use global co-occurrence statistics between textual and knowledge base relations as the supervision signal to train the embedding. Evaluation on two relational understanding tasks demonstrates the usefulness of the learned textual relation embedding. The data and code can be found at https://github.com/czyssrs/GloREPlus
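To make the definition of a textual relation concrete, the sketch below finds the shortest dependency path between two entity tokens. The sentence, its hand-written dependency edges, and the entity pair are illustrative assumptions, not taken from the paper's data; a real pipeline would obtain the edges from a dependency parser.

```python
from collections import deque

# Toy dependency parse of "Obama was born in Honolulu".
# Each (head, dependent) edge is hand-written for illustration,
# mimicking what a dependency parser would output.
edges = {
    ("born", "Obama"),    # nsubjpass
    ("born", "was"),      # auxpass
    ("born", "in"),       # prep
    ("in", "Honolulu"),   # pobj
}

def shortest_dependency_path(edges, source, target):
    """BFS over the undirected dependency graph to find the
    shortest path of tokens linking the two entities."""
    graph = {}
    for head, dep in edges:
        graph.setdefault(head, set()).add(dep)
        graph.setdefault(dep, set()).add(head)
    queue = deque([[source]])
    visited = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in graph.get(path[-1], ()):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None

path = shortest_dependency_path(edges, "Obama", "Honolulu")
# The tokens between the entities ("born", "in") form the textual relation.
```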
Highlights
Pre-trained embeddings such as word embeddings (Mikolov et al., 2013; Pennington et al., 2014; Peters et al., 2018; Devlin et al., 2018) and sentence embeddings (Le and Mikolov, 2014; Kiros et al., 2015) have become fundamental NLP tools.
If we can learn a general-purpose embedding for textual relations, it may facilitate many downstream relational understanding tasks by providing general relational knowledge.
We propose the novel task of learning a general-purpose embedding of textual relations, which has the potential to facilitate a wide range of relational understanding tasks.
Summary
Pre-trained embeddings such as word embeddings (Mikolov et al., 2013; Pennington et al., 2014; Peters et al., 2018; Devlin et al., 2018) and sentence embeddings (Le and Mikolov, 2014; Kiros et al., 2015) have become fundamental NLP tools. Su et al. (2018) propose to leverage global co-occurrence statistics of textual and KB relations to learn embeddings of textual relations, and show that this can effectively combat the wrong labeling problem of distant supervision (see Figure 1 for an example). While their method, named GloRE, achieves state-of-the-art performance on the popular New York Times (NYT) dataset (Riedel et al., 2010), the scope of their study is limited to relation extraction with small-scale in-domain training data. We use the global co-occurrence statistics as an explicit supervision signal.
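The idea of turning global co-occurrence statistics into a supervision signal can be sketched as follows: count how often each textual relation co-occurs with each KB relation across the corpus, then normalize the counts into a conditional distribution over KB relations. The relation names and counts below are made-up toy values, not figures from the ClueWeb09/Freebase data.

```python
from collections import Counter, defaultdict

# Hypothetical corpus-level counts of (textual relation, KB relation)
# pairs gathered via distant supervision; values are illustrative only.
cooccurrence = Counter({
    ("born in", "place_of_birth"): 90,
    ("born in", "place_of_death"): 5,
    ("died in", "place_of_death"): 70,
    ("died in", "place_of_birth"): 3,
})

def conditional_targets(cooccurrence):
    """Normalize global co-occurrence counts into
    P(KB relation | textual relation), usable as a soft
    supervision target for embedding training."""
    totals = defaultdict(int)
    for (textual, _), count in cooccurrence.items():
        totals[textual] += count
    targets = defaultdict(dict)
    for (textual, kb), count in cooccurrence.items():
        targets[textual][kb] = count / totals[textual]
    return dict(targets)

targets = conditional_targets(cooccurrence)
# e.g. targets["born in"]["place_of_birth"] == 90 / 95
```

Because the counts are aggregated over the whole corpus rather than per sentence, a noisy individual label contributes little to the normalized distribution, which is how global statistics mitigate the wrong labeling problem.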