Affordance Extraction and Inference based on Semantic Role Labeling

Daniel Loureiro, Alípio Jorge

doi:10.18653/v1/w18-5514

Abstract

Common-sense reasoning is becoming increasingly important for the advancement of Natural Language Processing. While word embeddings have been very successful, they cannot explain which aspects of ‘coffee’ and ‘tea’ make them similar, or how they could be related to ‘shop’. In this paper, we propose an explicit word representation that builds upon the Distributional Hypothesis to represent meaning from semantic roles, and allow inference of relations from their meshing, as supported by the affordance-based Indexical Hypothesis. We find that our model improves the state-of-the-art on unsupervised word similarity tasks while allowing for direct inference of new relations from the same vector space.

Highlights

The word representations used more recently in Natural Language Processing (NLP) have been based on the Distributional Hypothesis (DH) (Harris, 1954) — “words that occur in the same contexts tend to have similar meanings”
Our word representations are modelled using Predicate-Argument Structures (PASs). These structures are obtained through Semantic Role Labeling (SRL) of raw corpora, and used to populate a sparse word/context co-occurrence matrix W where roles serve as contexts, and argument spans serve as the co-occurrence windows
The explicit nature of the representations produced by our model makes them directly interpretable, similarly to other sparse representations such as those of Faruqui and Dyer (2015b)
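
The co-occurrence matrix described in the highlights can be illustrated with a minimal sketch. This is not the authors' implementation. The PAS records below are hand-written stand-ins for the output of an SRL system, the role labels follow a PropBank-like style, and the context key (predicate joined with role label) is an assumption, since the text only states that roles serve as contexts and argument spans serve as co-occurrence windows. The helper names `build_cooccurrence` and `top_contexts` are hypothetical.

```python
from collections import defaultdict

# Hypothetical PAS records: (predicate, role label, tokens in the argument span).
# A real pipeline would obtain these by running an SRL system over raw corpora.
pas_records = [
    ("drink", "ARG0", ["the", "man"]),
    ("drink", "ARG1", ["hot", "coffee"]),
    ("drink", "ARG1", ["green", "tea"]),
    ("sell", "ARG0", ["the", "shop"]),
    ("sell", "ARG1", ["coffee"]),
]


def build_cooccurrence(records):
    """Populate a sparse word/context count matrix W as nested dicts.

    Assumption: a context is the string "<predicate>.<role>"; every word
    inside an argument span co-occurs once with that span's context.
    """
    W = defaultdict(lambda: defaultdict(int))
    for predicate, role, span in records:
        context = f"{predicate}.{role}"
        for word in span:
            W[word][context] += 1
    return W


def top_contexts(W, word, k=3):
    """Return the k highest-count contexts for a word (ties broken by name)."""
    return sorted(W[word].items(), key=lambda kv: (-kv[1], kv[0]))[:k]


W = build_cooccurrence(pas_records)

# Because every dimension is a named context, the overlap between two words
# can be read off directly instead of being hidden in dense coordinates.
shared = set(W["coffee"]) & set(W["tea"])
print(top_contexts(W, "coffee"))
print(shared)
```

Since each dimension is a readable label, the contexts that two words share can be listed directly, which is the sense in which such explicit representations are interpretable. Raw counts are used here for simplicity; any weighting scheme applied to W is outside the scope of this sketch.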

Summary

Introduction

The word representations used more recently in Natural Language Processing (NLP) have been based on the Distributional Hypothesis (DH) (Harris, 1954): “words that occur in the same contexts tend to have similar meanings”. This simple idea has led to the development of powerful word embedding models, starting with Latent Semantic Analysis (LSA) (Landauer and Dumais, 1997) and later the popular word2vec (Mikolov et al., 2013) and GloVe (Pennington et al., 2014) models. Word embeddings nonetheless cannot explain which aspects of two words make them similar, or how those words could be related to a third. While there have been substantial improvements to word embedding models over the years, these shortcomings have endured (Camacho-Collados and Pilehvar, 2018).

Methods

Results

Conclusion