Shapley Idioms: Analysing BERT Sentence Embeddings for General Idiom Token Identification.

Vasudevan Nedumpozhimana, John D. Kelleher, Filip Klubička

doi:10.3389/frai.2022.813967


Open Access

https://doi.org/10.3389/frai.2022.813967


Abstract

This article examines the basis of Natural Language Understanding of transformer-based language models, such as BERT. It does this through a case study on idiom token classification. We use idiom token identification as a basis for our analysis because of the variety of information types that have previously been explored in the literature for this task, including topic, lexical, and syntactic features. This variety of relevant information types means that the task of idiom token identification enables us to explore the forms of linguistic information that a BERT language model captures and encodes in its representations. The core of this article presents three experiments. The first experiment analyzes the effectiveness of BERT sentence embeddings for creating a general idiom token identification model, and the results indicate that the BERT sentence embeddings outperform Skip-Thought. In the second and third experiments we use the game theory concept of Shapley values to rank the usefulness of individual idiomatic expressions for model training, and we use this ranking to analyze the type of information that the model finds useful. We find that a combination of idiom-intrinsic and topic-based properties contributes to an expression's usefulness in idiom token identification. Overall, our results indicate that BERT efficiently encodes a variety of information types, from topic through to lexical and syntactic information. Based on these results we argue that, notwithstanding recent criticisms of language-model-based semantics, the ability of BERT to efficiently encode a variety of linguistic information types does represent a significant step forward in natural language understanding.

Highlights

A large body of existing work focuses on probing BERT representations.
We confirm that distributed representations are suitable for this task. Beyond improving the state of the art, we perform a variety of experiments designed to investigate what types of information BERT uses for the task, and in this way explore what information is useful for creating a general idiom token identification model.
Contributions: (a) we report a new state of the art for general idiom token identification, using BERT sentence embeddings; (b) we demonstrate that the game theory concept of Shapley values provides a basis for analysing idiomatic usage; and (c) we explain the strong performance of BERT embeddings in terms of their ability to model idiom-intrinsic and topic-based properties.

Summary

INTRODUCTION

The last 5 years of natural language processing research have seen remarkable progress in the state of the art across a range of tasks (Wang et al., 2019). In this period, richer and more advanced embedding techniques, such as BERT (Devlin et al., 2018), have been developed and report vastly improved performance on many NLP tasks. It is worth investigating the application of these new embedding models to the problem of general idiom token identification. We build on the work of Salton et al. (2016) and look at how well a contemporary sentence embedding model performs on the task of general idiom token identification, using the example of English Verb-Noun Idiomatic Combinations (VNICs). Contributions: (a) we report a new state of the art for general idiom token identification, using BERT sentence embeddings; (b) we demonstrate that the game theory concept of Shapley values provides a basis for analysing idiomatic usage; and (c) we explain the strong performance of BERT embeddings in terms of their ability to model idiom-intrinsic and topic-based properties.
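To make the Shapley value idea in contribution (b) concrete, the following is a minimal sketch, not the authors' implementation. It treats each idiomatic expression as a "player" and computes exact Shapley values by averaging each player's marginal contribution over all coalitions of the other players. The function names `shapley_values` and `toy_score`, the three example expressions, and all numbers are assumptions made for illustration; in a real experiment the score function would be the performance of a classifier trained on the sentences of the expressions in the coalition.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley value of each player for a coalition value function.

    `value` maps a frozenset of players to a number. The cost grows
    exponentially with the number of players, so this only suits small sets.
    """
    n = len(players)
    result = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):  # size of the coalition that p joins: 0..n-1
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for coalition in combinations(others, k):
                s = frozenset(coalition)
                # Marginal contribution of p when added to coalition s.
                total += weight * (value(s | {p}) - value(s))
        result[p] = total
    return result


# Hypothetical per-expression gains; these are NOT results from the paper.
TOY_GAIN = {"blow whistle": 0.10, "make scene": 0.06, "lose head": 0.02}


def toy_score(coalition):
    """Toy stand-in for 'score of a model trained on these expressions'."""
    score = 0.5 + sum(TOY_GAIN[e] for e in coalition)
    # Assumed redundancy: two of the expressions partly overlap in what they teach.
    if {"blow whistle", "make scene"} <= coalition:
        score -= 0.04
    return score


values = shapley_values(list(TOY_GAIN), toy_score)
# Rank expressions from most to least useful under the toy score.
ranking = sorted(values, key=values.get, reverse=True)
print(ranking)
```

In this toy setting the redundancy penalty is shared equally between the two overlapping expressions, which is the behavior that makes Shapley values attractive for ranking training data: an expression is credited for what it adds on average, not only for what it adds in isolation. For realistic numbers of expressions the exact enumeration above becomes infeasible, and sampling-based approximations are commonly used instead.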

RELATED WORK

GENERAL IDIOM TOKEN IDENTIFICATION MODEL AND COMPARISON WITH STATE OF THE ART

ANALYSIS OF IDIOMATIC EXPRESSIONS

Shapley Value Analysis

Why Are Some Idioms More Useful?

Fixedness

Topic Distributional Similarity

Idiom Literal Divergence

Dataset Properties

Generalizability of the Model

CONCLUSION

Findings

DATA AVAILABILITY STATEMENT


Journal: Frontiers in Artificial Intelligence
Publication Date: Mar 14, 2022
Citations: 3
License type: CC BY 4.0
