Part-of-Speech Tagging Using Multiview Learning

Kyungtae Lim, Jungyeul Park

doi:10.1109/access.2020.3033979

Open Access

Abstract

In natural language processing, character-level representations are vector representations of individual characters. Recent work on character-level representations has focused on enriching subword information by stacking deep neural models. Ideally, applying several character-level representations together can help capture different aspects of the subword information. In practice, this approach has often failed, mainly because the representations were traditionally combined by simple concatenation. In this study, we explore different character-level modeling techniques. During the learning process, long short-term memory (LSTM)-based character representations can introduce different views for a part-of-speech (POS) tagger. After investigating two previously reported techniques, we propose two additional extended methods: (1) a multihead-attention character-level representation for capturing several aspects of subword information, and (2) an optimal structure for training two different character-level embeddings based on joint learning. We evaluate our results on the POS tagging dataset of the Conference on Computational Natural Language Learning (CoNLL) 2018 shared task on Universal Dependencies. We show that our method substantially improves POS tagging results for many morphologically rich languages, where character information carries more weight. Moreover, we compare the performance of our model with recent state-of-the-art POS taggers that are trained with language models such as Bidirectional Encoder Representations from Transformers (BERT) and Embeddings from Language Models (ELMo); our multiview tagger shows better results for nine languages. The proposed character model shows significant improvements in Ancient Greek, with average gains of 8.89 points in accuracy compared to the previous word representation model. Therefore, our empirical experiments indicate that character-level representations are more important than word representations for morphologically rich languages in terms of performance.
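
As a rough illustration of the multihead-attention idea mentioned in the abstract, the following minimal sketch pools the character vectors of one word with several attention heads and concatenates the results. It is not the authors' formulation: the per-head query vectors, the dot-product scoring, and all function names (`softmax`, `attention_head`, `multihead_char_representation`) are assumptions made for this example, and the toy vectors stand in for LSTM hidden states.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_head(char_states, query):
    """Pool one word's character vectors with a single (hypothetical) query vector.

    Returns the attention-weighted sum of the character vectors and the weights.
    """
    # Dot-product score between the query and each character vector.
    scores = [sum(q * h for q, h in zip(query, state)) for state in char_states]
    weights = softmax(scores)
    dim = len(char_states[0])
    pooled = [
        sum(w * state[d] for w, state in zip(weights, char_states))
        for d in range(dim)
    ]
    return pooled, weights


def multihead_char_representation(char_states, queries):
    """Concatenate the pooled vectors produced by every head."""
    output = []
    for query in queries:
        pooled, _ = attention_head(char_states, query)
        output.extend(pooled)
    return output


# Toy input: three characters, each with a 2-dimensional vector (assumed values).
char_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# Two heads: a zero query (uniform attention) and one that favors dimension 0.
queries = [[0.0, 0.0], [5.0, 0.0]]
word_vector = multihead_char_representation(char_states, queries)
```

Each head can attend to different characters, so the concatenated vector has `len(queries) * dim` entries. In this toy case the first head averages the three characters, while the second concentrates on the characters whose first component is large.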

Highlights

Natural language processing (NLP) has been focused on English and a few other languages that were economically profitable.
The objective of the Conference on Computational Natural Language Learning (CoNLL) 2018 shared task (ST) is to evaluate POS tagging and dependency parsing in a real-world setting that starts from raw text, across 57 languages.
Dataset: We evaluate our model on the Universal Dependencies (UD) 2.2 corpora provided for the CoNLL 2018 ST [31].

Summary

INTRODUCTION

Natural language processing (NLP) has been focused on English and a few other languages that were economically (or, more rarely, strategically) profitable. Combining different word representations at the character, token, or subword level has proven helpful for dependency parsing [2]–[4] and for other NLP tasks, including POS tagging [5]–[7]. Studies on character models have focused on enriching feature representations by stacking more neural layers [21], applying an attention mechanism [17], and appending a multilayer perceptron (MLP) to the output of recurrent networks [7]. This approach has obtained the best performance for POS tagging and dependency parsing on the CoNLL 2017 and 2018 ST datasets [22], [23]. We combine two different character embeddings: a context-independent word-based character representation [21] and a context-sensitive sentence-based character representation [24], [25].
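
The difference between the two character views can be sketched at the input level. The example below assumes that a word-based model reads each token's characters in isolation, while a sentence-based model reads the whole sentence as one character sequence and later picks out states at each word's span. The function names (`word_based_view`, `sentence_based_view`) and the use of a single space as separator are hypothetical choices for this sketch; the cited models [21], [24], [25] may prepare their inputs differently.

```python
def word_based_view(tokens):
    """Context-independent view: one character sequence per token."""
    return [list(token) for token in tokens]


def sentence_based_view(tokens):
    """Context-sensitive view: one character sequence for the whole sentence.

    Returns the characters and, for each token, its (start, end) span so that
    a sentence-level encoder's states can be read off at word boundaries.
    """
    chars = list(" ".join(tokens))
    spans = []
    start = 0
    for token in tokens:
        end = start + len(token)
        spans.append((start, end))
        start = end + 1  # skip the separating space
    return chars, spans


tokens = ["the", "cats", "sat"]
word_inputs = word_based_view(tokens)
sentence_chars, word_spans = sentence_based_view(tokens)
```

In the word-based view, the characters of "cats" are encoded identically in every sentence. In the sentence-based view, an encoder running over `sentence_chars` has already seen the preceding characters of "the" when it reaches the span of "cats", which is what makes the representation context-sensitive.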

BASIC NOTION OF THE CHARACTER MODEL

CONTEXT INSENSITIVE WORD-BASED CHARACTER MODEL

DEEP CONTEXTUALIZED MULTIVIEW POS TAGGER

TWO TAGGERS FROM CHARACTER MODELS

JOINT POS TAGGER

EXPERIMENTS AND RESULTS

RESULTS

CONCLUSION

Journal: IEEE Access	Publication Date: Jan 1, 2020
Citations: 1	License type: CC BY 4.0


