Robust Multi-task Learning-based Korean POS Tagging to Overcome Word Spacing Errors

Cheoneum Park,Juae Kim

doi:10.1145/3591206

Abstract

End-to-end neural network-based approaches have recently demonstrated significant improvements in natural language processing (NLP). However, in the NLP application such as assistant systems, NLP components are still processed to extract results using a pipeline paradigm. The pipeline-based concept has issues with error propagation. In Korean, morphological analysis and part-of-speech (POS) tagging step, incorrectly analyzing POS tags for a sentence containing spacing errors negatively affects other modules behind the POS module. Hence, we present a multi-task learning-based POS tagging neural model for Korean with word spacing challenges. When we apply this model to the Korean morphological analysis and POS tagging, we get findings that are robust to word spacing errors. We adopt syllable-level input and output formats, as well as a simple structure for ELECTRA and RNN-CRF models for multi-task learning, and we achieve a good performance 98.30 of F1, better than previous studies on the Sejong corpus test set.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

R Discovery Prime

R Discovery Prime

Robust Multi-task Learning-based Korean POS Tagging to Overcome Word Spacing Errors

Abstract

Talk to us

Similar Papers

More From: ACM Transactions on Asian and Low-Resource Language Information Processing

Lead the way for us

Similar Papers

Hidden Markov Model based Part of Speech Tagging for Nepali language
Abhijit Paul ... Bipul Syam Purkayastha
-
Abhijit Paul, et. al.Abhijit Paul ... Bipul Syam Purkayastha
01 Sep 2015
01 Sep 2015

Combination of Genetic Algorithm and Brill Tagger Algorithm for Part of Speech Tagging Bahasa Madura
Nindian Puspa Dewi ... Ubaidi Ubaidi
Proceeding of the Electrical Engineering Computer Science and Informatics | VOL. 7
Nindian Puspa Dewi, et. al.Nindian Puspa Dewi ... Ubaidi Ubaidi
01 Oct 2020
Proceeding of the Electrical Engineering Computer Science and Informatics | VOL. 7

Implementation of Kadazan Tagger Based on Brill's Method
Marylyn Alex ... Lailatul Qadri Zakaria
Journal of ICT Research and Applications | VOL. 7
Marylyn Alex, et. al.Marylyn Alex ... Lailatul Qadri Zakaria
01 Dec 2013
Journal of ICT Research and Applications | VOL. 7

A REVIEW ON DIFFERENT APPROACHES OF POS TAGGING IN NLP
K Aparna ... Pooja Bhakta
-
K Aparna, et. al. K Aparna ... Pooja Bhakta
01 Jan 2020
01 Jan 2020

Editage

Paperpal

R Discovery

Mind the Graph

R Discovery Prime

R Discovery Prime

Robust Multi-task Learning-based Korean POS Tagging to Overcome Word Spacing Errors

Abstract

Talk to us

Similar Papers

More From: ACM Transactions on Asian and Low-Resource Language Information Processing