LINSPECTOR WEB: A Multilingual Probing Suite for Word Representations

Max Eichler, Iryna Gurevych, Gözde Gül Şahin

doi:10.18653/v1/d19-3022

Abstract

We present LINSPECTOR WEB, an open-source multilingual inspector to analyze word representations. Our system provides researchers working in low-resource settings with an easily accessible, web-based probing tool to gain quick insights into their word embeddings, especially outside of the English language. To do this, we employ 16 simple linguistic probing tasks such as gender, case marking, and tense for a diverse set of 28 languages. We support probing of static word embeddings along with pretrained AllenNLP models that are commonly used for NLP downstream tasks such as named entity recognition, natural language inference and dependency parsing. The results are visualized in a polar chart and also provided as a table. LINSPECTOR WEB is available as an offline tool or at https://linspector.ukp.informatik.tu-darmstadt.de.

Highlights

Natural language processing (NLP) has seen great progress after the introduction of continuous, dense, low-dimensional vectors to represent text.
Datasets for either of those tasks do not exist for many languages, word similarity tests do not necessarily correlate well with downstream tasks, and evaluating embeddings on downstream tasks can be too computationally demanding for low-resource scenarios.
BiaffineDependencyParser and CrfTagger are highlighted as the default choices for dependency parsing and named entity recognition by Gardner et al. (2018), while ESIM was picked as one of two available natural language inference models, and SimpleTagger support was added as the entry-level AllenNLP classifier to solve tasks like part-of-speech tagging.

Summary

Introduction

Natural language processing (NLP) has seen great progress after the introduction of continuous, dense, low-dimensional vectors to represent text. The field has witnessed the creation of various word embedding models such as monolingual (Mikolov et al., 2013), contextualized (Peters et al., 2018), multi-sense (Pilehvar et al., 2017) and dependency-based (Levy and Goldberg, 2014), as well as the adaptation and design of neural network architectures for a wide range of NLP tasks. Despite their impressive performance, interpreting, analyzing and evaluating such black-box models has been shown to be challenging, which even led to a workshop series (Linzen et al., 2018). Unlike most studies, Köhn (2015) introduced a set of multilingual probing tasks; however, its scope has been limited to syntactic tests and 7 languages. More importantly, it is not accessible as a web application, and the source code does not support probing pretrained downstream NLP models out of the box. To the best of our knowledge, this is the first web application that (a) performs online probing; (b) enables users to upload their pretrained downstream task models to automatically analyze different layers and epochs; and (c) has support for 28 languages, some of them extremely low-resource, such as Quechuan.
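
As a rough illustration of what a probing task involves, the sketch below trains a small classifier on frozen word vectors to predict a single linguistic label and reports its accuracy. It is a generic linear probe written for this summary, not the LINSPECTOR WEB implementation: the toy embeddings, the binary "plural vs. singular" label, and the function names (train_probe, probe_accuracy, make_toy_embeddings) are all assumptions made for the example.

```python
import math
import random


def sigmoid(z):
    # Clamp the input so math.exp cannot overflow for extreme scores.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def train_probe(vectors, labels, epochs=50, lr=0.1):
    """Fit a binary logistic-regression probe on frozen embeddings (plain SGD)."""
    dim = len(vectors[0])
    weights = [0.0] * dim
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(vectors, labels):
            score = sum(w * xi for w, xi in zip(weights, x)) + bias
            grad = sigmoid(score) - y  # gradient of the log loss w.r.t. the score
            weights = [w - lr * grad * xi for w, xi in zip(weights, x)]
            bias -= lr * grad
    return weights, bias


def probe_accuracy(weights, bias, vectors, labels):
    """Share of held-out words whose label the probe predicts correctly."""
    correct = 0
    for x, y in zip(vectors, labels):
        score = sum(w * xi for w, xi in zip(weights, x)) + bias
        correct += int((score > 0) == bool(y))
    return correct / len(labels)


def make_toy_embeddings(n, dim, rng):
    """Hypothetical data: dimension 0 carries the label, the rest is noise."""
    vectors, labels = [], []
    for i in range(n):
        label = i % 2  # 1 = "plural", 0 = "singular" (assumed toy feature)
        vec = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        vec[0] = (1.0 if label else -1.0) + rng.gauss(0.0, 0.1)
        vectors.append(vec)
        labels.append(label)
    return vectors, labels


rng = random.Random(0)
train_x, train_y = make_toy_embeddings(200, 8, rng)
test_x, test_y = make_toy_embeddings(100, 8, rng)

weights, bias = train_probe(train_x, train_y)
accuracy = probe_accuracy(weights, bias, test_x, test_y)
print(f"probe accuracy: {accuracy:.2f}")
```

A high held-out accuracy suggests the embeddings encode the probed property in an easily recoverable form; running one such probe per task and language yields the kind of per-task scores that a polar chart or table can summarize.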

Previous Systems

LINSPECTOR WEB

Scope of Probing

Features

Backend

Training Times

Conclusion

Publication Date: Jan 1, 2019
Citations: 9
License type: cc-by

Similar Papers

LINSPECTOR: Multilingual Probing Tasks for Word Representations
Gözde Gül Şahin ... Ilia Kuznetsov
Computational Linguistics | VOL. 46
01 Jun 2020

Evaluation and Analysis of Word Embedding Vectors of English Text Using Deep Learning Technique
Jaspreet Singh ... Prithvipal Singh
01 Jan 2018

Improved biomedical word embeddings in the transformer era
Jiho Noh ... Ramakanth Kavuluru
Journal of Biomedical Informatics | VOL. 120
18 Jul 2021

Named Entity Recognition Based on Dependency Parsing and BiLSTM-CRF
Sheping Zhai ... Yun Chai
01 Jan 2021
