Abstract

We introduce ‘POLAR’ — a framework that adds interpretability to pre-trained word embeddings via the adoption of semantic differentials. Semantic differentials are a psychometric construct for measuring the semantics of a word by analysing its position on a scale between two polar opposites (e.g., cold – hot, soft – hard). The core idea of our approach is to transform existing, pre-trained word embeddings via semantic differentials to a new “polar” space with interpretable dimensions defined by such polar opposites. Our framework also allows for selecting the most discriminative dimensions from a set of polar dimensions provided by an oracle, i.e., an external source. We demonstrate the effectiveness of our framework by deploying it to various downstream tasks, in which our interpretable word embeddings achieve a performance that is comparable to the original word embeddings. We also show that the interpretable dimensions selected by our framework align with human judgement. Together, these results demonstrate that interpretability can be added to word embeddings without compromising performance. Our work is relevant for researchers and engineers interested in interpreting pre-trained word embeddings.
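To make the core idea concrete, here is a minimal sketch in Python of one way such a transformation can work: each interpretable dimension is the direction between the embeddings of two polar opposites, and every word vector is scored by its projection onto those directions. All names here (polar_transform, antonym_pairs) are illustrative, and the paper's exact formulation (e.g., how it normalizes or inverts the direction matrix) may differ.

```python
import numpy as np

def polar_transform(embeddings: np.ndarray,
                    vocab: dict,
                    antonym_pairs: list) -> np.ndarray:
    """Project word embeddings onto axes defined by antonym pairs.

    embeddings:    (V, d) matrix of pre-trained word vectors.
    vocab:         maps a word to its row index in `embeddings`.
    antonym_pairs: list of (negative_pole, positive_pole) pairs,
                   e.g. [("cold", "hot"), ("soft", "hard")].
    Returns a (V, k) matrix with one interpretable dimension per pair.
    """
    directions = []
    for neg, pos in antonym_pairs:
        axis = embeddings[vocab[pos]] - embeddings[vocab[neg]]
        directions.append(axis / np.linalg.norm(axis))  # unit-length polar axis
    D = np.stack(directions)   # (k, d): one row per polar dimension
    return embeddings @ D.T    # (V, k): coordinate of each word on each axis
```

A word's coordinate on, say, the cold-hot axis then indicates where it falls between the two poles, which is what makes the resulting dimensions human-readable.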

Highlights

  • Dense distributed word representations such as Word2Vec [21] and GloVe [27] are a key building block for a wide variety of natural language processing tasks, including translation [44], sentiment analysis [36], and image captioning [43]

  • Our method captures multiple interpretations of a word, demonstrating that the proposed POLAR framework produces interpretable dimensions that are easy for humans to recognize

  • We utilize the concept of the semantic differential from psychometrics to transform pre-trained word embeddings into interpretable word embeddings

Introduction

Dense distributed word representations such as Word2Vec [21] and GloVe [27] have been established as a key step in technical solutions for a wide variety of natural language processing tasks, including translation [44], sentiment analysis [36], and image captioning [43]. While such word representations have substantially contributed to improving the performance of these tasks, it is usually difficult for humans to make sense of them. Our results are robust across different embedding algorithms, which demonstrates that we can augment word embeddings with interpretability without much loss of performance across a range of tasks. The method also scales to very large corpora.
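As a usage sketch under the same assumptions, the transformation above can be applied unchanged to embeddings produced by different algorithms, for example GloVe or Word2Vec vectors loaded through gensim's downloader; polar_transform is the illustrative helper defined earlier, not an API from the paper.

```python
import gensim.downloader as api

# Swap in "word2vec-google-news-300" to try a different embedding algorithm.
kv = api.load("glove-wiki-gigaword-300")

vocab = {word: i for i, word in enumerate(kv.index_to_key)}
pairs = [("cold", "hot"), ("soft", "hard"), ("slow", "fast")]

# polar_transform is the illustrative sketch from the Abstract section above.
polar = polar_transform(kv.vectors, vocab, pairs)   # (V, 3) interpretable space

# Inspect where one word falls on each polar axis.
axes = [f"{neg}-{pos}" for neg, pos in pairs]
print(dict(zip(axes, polar[vocab["ice"]])))
```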
