Lexical Simplification by Unsupervised Machine Translation

Akihiro Katsuta, Kazuhide Yamamoto

doi:10.1142/s2717554520500083

Abstract

In recent years, simple Japanese has attracted attention as a means of conveying information to foreigners. Automatic text simplification aims to reduce the complexity of the vocabulary and expressions in a sentence while retaining its original meaning. This paper focuses on lexical simplification, with the goal of reducing the vocabulary used. Because constructing or expanding a simplification corpus is very costly, we build the simplification model with Unsupervised Statistical Machine Translation, which does not require a parallel corpus. Based on a predetermined vocabulary, we construct a pseudo-corpus for simplification from a web corpus and train the simplification model on it, so training requires only a vocabulary and a plain text corpus. Moreover, we propose cleaning the phrase table with WordNet, which improves performance on the BLEU and SARI metrics: suppressing distant paraphrases with WordNet makes it easier to select the correct paraphrase candidate.
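The abstract does not spell out how the pseudo-corpus is built or how the phrase table is cleaned, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that (1) sentences from a plain corpus are split into a "simple" side and a "complex" side according to whether every token is in the predetermined vocabulary, and (2) a phrase-table candidate is kept only if a lexical resource lists it as related to the source word. The function names, the toy English data, and the `related` dictionary (a stand-in for a WordNet lookup) are all hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple


def split_by_vocabulary(
    sentences: Iterable[List[str]], simple_vocab: Set[str]
) -> Tuple[List[List[str]], List[List[str]]]:
    """Assumed pseudo-corpus step: a sentence goes to the simple side only
    if every token is in the predetermined vocabulary."""
    simple_side: List[List[str]] = []
    complex_side: List[List[str]] = []
    for tokens in sentences:
        if all(tok in simple_vocab for tok in tokens):
            simple_side.append(tokens)
        else:
            complex_side.append(tokens)
    return simple_side, complex_side


def filter_phrase_table(
    phrase_table: Dict[str, Dict[str, float]],
    related: Dict[str, Set[str]],
) -> Dict[str, Dict[str, float]]:
    """Assumed cleaning step: keep a candidate only if it equals the source
    word or the (hypothetical) lexical resource lists it as related.
    Source entries with no surviving candidate are dropped."""
    cleaned: Dict[str, Dict[str, float]] = {}
    for src, candidates in phrase_table.items():
        allowed = related.get(src, set())
        kept = {
            tgt: score
            for tgt, score in candidates.items()
            if tgt == src or tgt in allowed
        }
        if kept:
            cleaned[src] = kept
    return cleaned


# Toy data, invented for illustration only.
sentences = [
    ["the", "cat", "sat"],
    ["the", "feline", "reclined"],
    ["a", "dog", "ran"],
]
simple_vocab = {"the", "cat", "sat", "a", "dog", "ran"}
simple_side, complex_side = split_by_vocabulary(sentences, simple_vocab)

phrase_table = {
    "feline": {"cat": 0.6, "dog": 0.3, "feline": 0.1},
    "reclined": {"ran": 0.5},
}
related = {"feline": {"cat"}}  # stand-in for a WordNet synonym lookup
cleaned = filter_phrase_table(phrase_table, related)
```

In this toy run, the distant candidate "dog" is removed from the entry for "feline", which is the kind of suppression the abstract credits with making the correct paraphrase easier to select. A real system would work on tokenized Japanese text and query WordNet rather than a hand-written dictionary.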

Journal: International Journal of Asian Language Processing
Publication Date: Jun 1, 2020
Citations: 3
