Optimizing segmentation granularity for neural machine translation

Elizabeth Salesky, Graham Neubig, Alex Coda, Jan Niehues, Andrew Runge

doi:10.1007/s10590-019-09243-8

Abstract

In neural machine translation (NMT), it has become standard to translate using subword units to allow for an open vocabulary and improve accuracy on infrequent words. Byte-pair encoding (BPE) and its variants are the predominant approach to generating these subwords, as they are unsupervised, resource-free, and empirically effective. However, the granularity of these subword units is a hyperparameter to be tuned for each language and task, using methods such as grid search. Tuning may be done inexhaustively or skipped entirely due to resource constraints, leading to sub-optimal performance. In this paper, we propose a method to automatically tune this parameter using only one training pass. We incrementally introduce new BPE vocabulary online based on the held-out validation loss, beginning with smaller, general subwords and adding larger, more specific units over the course of training. Our method matches the results found with grid search, optimizing segmentation granularity while significantly reducing overall training time. We also show benefits in training efficiency and performance improvements for rare words due to the way embeddings for larger units are incrementally constructed by combining those from smaller units.
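The incremental schedule described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names (learn_bpe, merge_pair, segment, grow_embeddings, next_merge_count) and the patience and step parameters are hypothetical. Two details are assumptions made only for illustration, since the abstract does not specify them: the embedding of a new unit is taken to be the mean of the embeddings of its two parts, and new merges are added when the validation loss has stopped improving over the last few evaluations.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with the merged unit."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_bpe(word_counts, num_merges):
    """Learn up to `num_merges` BPE merge operations from a {word: frequency} dict."""
    vocab = {tuple(w) + ("</w>",): f for w, f in word_counts.items()}
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break  # every word is already a single unit
        # Most frequent pair; ties are broken by the pair itself for determinism.
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        vocab = {merge_pair(s, best): f for s, f in vocab.items()}
    return merges


def segment(word, merges):
    """Segment `word` using only the given (possibly truncated) merge list."""
    symbols = tuple(word) + ("</w>",)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


def grow_embeddings(emb, new_merges):
    """Add an embedding for each new unit as the mean of its two parts.

    Averaging is an assumption of this sketch; the abstract only says that
    larger units are built by combining the embeddings of smaller ones.
    """
    for a, b in new_merges:
        emb[a + b] = [(x + y) / 2 for x, y in zip(emb[a], emb[b])]


def next_merge_count(current, val_losses, step, max_merges, patience=2):
    """Return a larger merge count once the held-out loss has plateaued.

    Plateau criterion (an assumption): the best loss among the last
    `patience` evaluations is no better than the best loss before them.
    """
    if len(val_losses) > patience and \
            min(val_losses[-patience:]) >= min(val_losses[:-patience]):
        return min(current + step, max_merges)
    return current


if __name__ == "__main__":
    counts = {"low": 5, "lower": 2, "newest": 6, "widest": 3}
    merges = learn_bpe(counts, 10)
    # Early in training: few merges, so small and general units.
    print(segment("lowest", merges[:2]))
    # Later in training: more merges, so larger and more specific units.
    print(segment("lowest", merges))
```

Because segmentation with the first k merges is always a refinement of segmentation with the first k+1 merges, a single learned merge list is enough to move from fine to coarse granularity during one training run, which is the property the sketch relies on.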

Journal: Machine Translation
Publication Date: Jan 24, 2020
Citations: 28

Similar Papers

Arabic–Chinese Neural Machine Translation: Romanized Arabic as Subword Unit for Arabic-sourced Translation
Fares Aqlan ... Akram Al-Mansoub
IEEE Access | VOL. 7
01 Jan 2019

Effect of linguistic information in neural machine translation
Naomichi Nakamura ... Hitoshi Isahara
01 Aug 2017

Experience of neural machine translation between Indian languages
Shubham Dewangan ... Pushpak Bhattacharyya
Machine Translation | VOL. 35
01 Apr 2021

A Statistical Extension of Byte-Pair Encoding
David Vilar ... Marcello Federico
01 Jan 2020
