Visually Analyzing Contextualized Embeddings

Matthew Berger

doi:10.1109/vis47514.2020.00062

Abstract

In this paper we introduce a method for visually analyzing contextualized embeddings produced by deep neural network-based language models. Our approach is inspired by linguistic probes for natural language processing, where tasks are designed to probe language models for linguistic structure, such as parts of speech and named entities. These approaches are largely confirmatory, however, only enabling a user to test for information known a priori. In this work, we eschew supervised probing tasks and advocate for unsupervised probes, coupled with visual exploration techniques, to assess what is learned by language models. Specifically, we cluster contextualized embeddings produced from a large text corpus, and introduce a visualization design based on this clustering and on textual structure (cluster co-occurrences, cluster spans, and cluster-word membership) to help elicit the functionality of, and relationship between, individual clusters. User feedback highlights the benefits of our design in discovering different types of linguistic structures.
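As a rough illustration of the unsupervised probe described above, the following sketch clusters a handful of token embeddings and derives two of the named summaries: cluster-word membership and cluster co-occurrences. It is not the paper's implementation. The 2-D vectors are made-up stand-ins for real contextualized embeddings, k-means with farthest-point initialization is an assumed choice of clustering algorithm, and counting co-occurrence at the sentence level is likewise an assumption.

```python
from collections import Counter
from itertools import combinations

# Toy corpus: each token carries a made-up 2-D "contextualized embedding".
# Real embeddings would come from a language model and have far more dimensions.
SENTENCES = [
    [("the", (9.0, 0.1)), ("bank", (0.1, 0.2)), ("loan", (0.3, 0.0))],
    [("the", (9.1, 0.0)), ("river", (5.0, 5.1)), ("bank", (5.2, 4.9))],
    [("a", (8.9, 0.2)), ("bank", (0.0, 0.1)), ("deposit", (0.2, 0.3))],
]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_point_init(points, k):
    """Deterministic seeding: start at the first point, then repeatedly
    pick the point farthest from all centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    return centroids


def kmeans(points, k, iters=20):
    """Plain k-means (an assumed stand-in for the paper's clustering step)."""
    centroids = farthest_point_init(points, k)
    labels = []
    for _ in range(iters):
        # Assign every point to its nearest centroid.
        labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points]
        # Move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(d) / len(members) for d in zip(*members))
    return labels


# Flatten the corpus so every token occurrence gets its own cluster label.
tokens = [(word, vec) for sent in SENTENCES for word, vec in sent]
labels = kmeans([vec for _, vec in tokens], k=3)

# Cluster-word membership: which words fall into each cluster, and how often.
membership = {}
for (word, _), label in zip(tokens, labels):
    membership.setdefault(label, Counter())[word] += 1

# Cluster co-occurrence: how often two clusters appear in the same sentence.
cooccurrence = Counter()
pos = 0
for sent in SENTENCES:
    clusters = sorted(set(labels[pos:pos + len(sent)]))
    cooccurrence.update(combinations(clusters, 2))
    pos += len(sent)

for label in sorted(membership):
    print(label, dict(membership[label]))
print(dict(cooccurrence))
```

In this toy data the word "bank" lands in two different clusters depending on its context, which is the kind of structure a membership view is meant to surface without a labeled probing task.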

